Preface


Linear algebra has in recent years become an essential part of the mathematical background required by 
mathematicians and mathematics teachers, engineers, computer scientists, physicists, economists, and 
statisticians, among others. This requirement reflects the importance and wide applications of the subject 
matter. 

This book is designed for use as a textbook for a formal course in linear algebra or as a supplement to all 
current standard texts. It aims to present an introduction to linear algebra which will be found helpful to all 
readers regardless of their fields of specialization. More material has been included than can be covered in most
first courses. This has been done to make the book more flexible, to provide a useful book of reference, and to 
stimulate further interest in the subject. 

Each chapter begins with clear statements of pertinent definitions, principles, and theorems together with 
illustrative and other descriptive material. This is followed by graded sets of solved and supplementary 
problems. The solved problems serve to illustrate and amplify the theory, and to provide the repetition of basic 
principles so vital to effective learning. Numerous proofs, especially those of all essential theorems, are 
included among the solved problems. The supplementary problems serve as a complete review of the material 
of each chapter. 

The first three chapters treat vectors in Euclidean space, matrix algebra, and systems of linear equations. 
These chapters provide the motivation and basic computational tools for the abstract investigations of vector 
spaces and linear mappings which follow. After chapters on inner product spaces and orthogonality and on 
determinants, there is a detailed discussion of eigenvalues and eigenvectors giving conditions for representing 
a linear operator by a diagonal matrix. This naturally leads to the study of various canonical forms, 
specifically, the triangular, Jordan, and rational canonical forms. Later chapters cover linear functions and 
the dual space V*, and bilinear, quadratic, and Hermitian forms. The last chapter treats linear operators on 
inner product spaces. 

The main changes in the fourth edition have been in the appendices. First of all, we have expanded 
Appendix A on the tensor and exterior products of vector spaces where we have now included proofs on the 
existence and uniqueness of such products. We also added appendices covering algebraic structures, including 
modules, and polynomials over a field. Appendix D, “Odds and Ends,” includes the Moore-Penrose
generalized inverse which appears in various applications, such as statistics. There are also many additional 
solved and supplementary problems. 

Finally, we wish to thank the staff of the McGraw-Hill Schaum’s Outline Series, especially Charles Wall, 
for their unfailing cooperation. 


SEYMOUR LIPSCHUTZ 
MARC LARS LIPSON 
Vectors in $R^n$ and $C^n$,
Spatial Vectors


1.1 Introduction


There are two ways to motivate the notion of a vector: one is by means of lists of numbers and subscripts, 
and the other is by means of certain objects in physics. We discuss these two ways below. 

Here we assume the reader is familiar with the elementary properties of the field of real numbers, 
denoted by R. On the other hand, we will review properties of the field of complex numbers, denoted by 
C. In the context of vectors, the elements of our number fields are called scalars. 

Although we will restrict ourselves in this chapter to vectors whose elements come from R and then 
from C, many of our operations also apply to vectors whose entries come from some arbitrary field K. 


Lists of Numbers 
Suppose the weights (in pounds) of eight students are listed as follows: 
156, 125, 145, 134, 178, 145, 162, 193 
One can denote all the values in the list using only one symbol, say w, but with different subscripts; that is, 
$w_1, w_2, w_3, w_4, w_5, w_6, w_7, w_8$
Observe that each subscript denotes the position of the value in the list. For example,
$w_1 = 156$, the first number, $w_2 = 125$, the second number, ...
Such a list of values,
$w = (w_1, w_2, w_3, \ldots, w_8)$


is called a linear array or vector. 


Vectors in Physics 


Many physical quantities, such as temperature and speed, possess only “magnitude.” These quantities 
can be represented by real numbers and are called scalars. On the other hand, there are also quantities, 
such as force and velocity, that possess both “magnitude” and “direction.” These quantities, which can 
be represented by arrows having appropriate lengths and directions and emanating from some given 
reference point O, are called vectors. 

Now we assume the reader is familiar with the space $R^3$ where all the points in space are represented
by ordered triples of real numbers. Suppose the origin of the axes in $R^3$ is chosen as the reference point O
for the vectors discussed above. Then every vector is uniquely determined by the coordinates of its 
endpoint, and vice versa. 

There are two important operations, vector addition and scalar multiplication, associated with vectors
in physics. The definition of these operations and the relationship between these operations and the
endpoints of the vectors are as follows.
[Figure 1-1: (a) Vector Addition; (b) Scalar Multiplication]


(i) Vector Addition: The resultant u + v of two vectors u and v is obtained by the parallelogram law;
that is, u + v is the diagonal of the parallelogram formed by u and v. Furthermore, if (a, b, c) and
(a', b', c') are the endpoints of the vectors u and v, then (a + a', b + b', c + c') is the endpoint of the
vector u + v. These properties are pictured in Fig. 1-1(a).


(ii) Scalar Multiplication: The product ku of a vector u by a real number k is obtained by multiplying
the magnitude of u by |k| and retaining the same direction if k > 0 or the opposite direction if k < 0.
Also, if (a, b, c) is the endpoint of the vector u, then (ka, kb, kc) is the endpoint of the vector ku.
These properties are pictured in Fig. 1-1(b).


Mathematically, we identify the vector u with its endpoint (a, b, c) and write u = (a, b, c). Moreover, we call
the ordered triple (a, b, c) of real numbers a point or vector depending upon its interpretation. We
generalize this notion and call an n-tuple $(a_1, a_2, \ldots, a_n)$ of real numbers a vector. However, special
notation may be used for the vectors in $R^3$ called spatial vectors (Section 1.6).


1.2 Vectors in $R^n$


The set of all n-tuples of real numbers, denoted by $R^n$, is called n-space. A particular n-tuple in $R^n$, say
$u = (a_1, a_2, \ldots, a_n)$


is called a point or vector. The numbers $a_i$ are called the coordinates, components, entries, or elements
of u. Moreover, when discussing the space $R^n$, we use the term scalar for the elements of R.

Two vectors, u and v, are equal, written u = v, if they have the same number of components and if the 
corresponding components are equal. Although the vectors (1,2,3) and (2,3, 1) contain the same three 
numbers, these vectors are not equal because corresponding entries are not equal. 

The vector (0,0,...,0) whose entries are all 0 is called the zero vector and is usually denoted by 0. 


EXAMPLE 1.1 


(a) The following are vectors: 
(2,-5), (7,9), (0,0,0), (3,4,5) 


The first two vectors belong to $R^2$, whereas the last two belong to $R^3$. The third is the zero vector in $R^3$.
(b) Find x,y,z such that (x — y, x+y, z— 1) = (4,2,3). 


By definition of equality of vectors, corresponding entries must be equal. Thus, 
x-y=4, x+y=2, z-1=3 


Solving the above system of equations yields x = 3, y= —1, z = 4. 
Column Vectors


Sometimes a vector in n-space $R^n$ is written vertically rather than horizontally. Such a vector is called a
column vector, and, in this context, the horizontally written vectors in Example 1.1 are called row
vectors. For example, the following are column vectors with 2, 2, 3, and 3 components, respectively:


$$\begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad \begin{bmatrix} 3 \\ -4 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 5 \\ -6 \end{bmatrix}, \quad \begin{bmatrix} 1.5 \\ 2/3 \\ -15 \end{bmatrix}$$


We also note that any operation defined for row vectors is defined analogously for column vectors. 


1.3 Vector Addition and Scalar Multiplication 


Consider two vectors u and v in $R^n$, say
$u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$

Their sum, written u + v, is the vector obtained by adding corresponding components from u and v. That is,
$u + v = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$


The scalar product or, simply, product, of the vector u by a real number k, written ku, is the vector
obtained by multiplying each component of u by k. That is,
$ku = k(a_1, a_2, \ldots, a_n) = (ka_1, ka_2, \ldots, ka_n)$
Observe that u + v and ku are also vectors in $R^n$. The sum of vectors with different numbers of
components is not defined.
Negatives and subtraction are defined in $R^n$ as follows:


$-u = (-1)u$ and $u - v = u + (-v)$


The vector -u is called the negative of u, and u - v is called the difference of u and v.
Now suppose we are given vectors $u_1, u_2, \ldots, u_m$ in $R^n$ and scalars $k_1, k_2, \ldots, k_m$ in R. We can
multiply the vectors by the corresponding scalars and then add the resultant scalar products to form the
vector


$v = k_1 u_1 + k_2 u_2 + k_3 u_3 + \cdots + k_m u_m$
Such a vector v is called a linear combination of the vectors $u_1, u_2, \ldots, u_m$.
EXAMPLE 1.2 


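(a) Let u = (2, 4, -5) and v = (1, -6, 9). Then


$u + v = (2 + 1, 4 + (-6), -5 + 9) = (3, -2, 4)$
$7u = (7(2), 7(4), 7(-5)) = (14, 28, -35)$
$-v = (-1)(1, -6, 9) = (-1, 6, -9)$
$3u - 5v = (6, 12, -15) + (-5, 30, -45) = (1, 42, -60)$
(b) The zero vector 0 = (0, 0, ..., 0) in $R^n$ is similar to the scalar 0 in that, for any vector $u = (a_1, a_2, \ldots, a_n)$,
$u + 0 = (a_1 + 0, a_2 + 0, \ldots, a_n + 0) = (a_1, a_2, \ldots, a_n) = u$


(c) Let $u = \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix}$ and $v = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$. Then
$$2u - 3v = \begin{bmatrix} 4 \\ 6 \\ -8 \end{bmatrix} + \begin{bmatrix} -9 \\ 3 \\ 6 \end{bmatrix} = \begin{bmatrix} -5 \\ 9 \\ -2 \end{bmatrix}$$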
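The componentwise definitions of this section translate directly into code. The following is a minimal Python sketch, assuming vectors are stored as tuples of numbers; the helper names `add`, `scale`, and `linear_combination` are illustrative choices, not notation from the text.

```python
def add(u, v):
    """Return u + v; the sum is undefined for different numbers of components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return tuple(a + b for a, b in zip(u, v))


def scale(k, u):
    """Return the scalar product ku."""
    return tuple(k * a for a in u)


def linear_combination(scalars, vectors):
    """Return k_1*u_1 + k_2*u_2 + ... + k_m*u_m."""
    if not vectors or len(scalars) != len(vectors):
        raise ValueError("need one scalar per vector and at least one vector")
    result = scale(scalars[0], vectors[0])
    for k, u in zip(scalars[1:], vectors[1:]):
        result = add(result, scale(k, u))
    return result


u, v = (2, 4, -5), (1, -6, 9)
print(add(u, v))                            # u + v
print(scale(7, u))                          # 7u
print(linear_combination([3, -5], [u, v]))  # 3u - 5v
```

Writing subtraction as a linear combination with a negative scalar mirrors the definition u - v = u + (-v).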
Basic properties of vectors under the operations of vector addition and scalar multiplication are
described in the following theorem.


THEOREM 1.1: For any vectors u, v, w in $R^n$ and any scalars k, k' in R,


(i) (u + v) + w = u + (v + w),    (v) k(u + v) = ku + kv,
(ii) u + 0 = u,                    (vi) (k + k')u = ku + k'u,
(iii) u + (-u) = 0,                (vii) (kk')u = k(k'u),
(iv) u + v = v + u,                (viii) 1u = u.


We postpone the proof of Theorem 1.1 until Chapter 2, where it appears in the context of matrices 
(Problem 2.3). 

Suppose u and v are vectors in $R^n$ for which u = kv for some nonzero scalar k in R. Then u is called a
multiple of v. Also, u is said to be in the same or opposite direction as v according to whether k > 0 or
k < 0.


1.4 Dot (Inner) Product 


Consider arbitrary vectors u and v in $R^n$; say,
$u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$
The dot product or inner product or scalar product of u and v is denoted and defined by
$u \cdot v = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$
That is, $u \cdot v$ is obtained by multiplying corresponding components and adding the resulting products.
The vectors u and v are said to be orthogonal (or perpendicular) if their dot product is zero, that is, if
$u \cdot v = 0$.
EXAMPLE 1.3 
(a) Let u = (1, -2, 3), v = (4, 5, -1), w = (2, 7, 4). Then,
$u \cdot v = 1(4) - 2(5) + 3(-1) = 4 - 10 - 3 = -9$
$u \cdot w = 2 - 14 + 12 = 0$, $v \cdot w = 8 + 35 - 4 = 39$


Thus, u and w are orthogonal.


(b) Let $u = \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix}$ and $v = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$. Then $u \cdot v = 6 - 3 + 8 = 11$.


(c) Suppose u = (1, 2, 3, 4) and v = (6, k, -8, 2). Find k so that u and v are orthogonal.


First obtain $u \cdot v = 6 + 2k - 24 + 8 = -10 + 2k$. Then set $u \cdot v = 0$ and solve for k:
$-10 + 2k = 0$ or $2k = 10$ or $k = 5$
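The dot product and the orthogonality test can be sketched in a few lines of Python. The function names below are hypothetical, and `solve_orthogonal_k` rests on a stated assumption: the dot product depends linearly on the unknown k, as it does in part (c) above, so two evaluations determine the root.

```python
from fractions import Fraction


def dot(u, v):
    """Return the dot product of u and v, which must have equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal(u, v):
    """u and v are orthogonal exactly when their dot product is zero."""
    return dot(u, v) == 0


def solve_orthogonal_k(u_of_k, v_of_k):
    """Find k with u(k) . v(k) = 0, assuming the dot product is linear in k."""
    f0 = dot(u_of_k(0), v_of_k(0))
    f1 = dot(u_of_k(1), v_of_k(1))
    slope = f1 - f0
    if slope == 0:
        raise ValueError("dot product does not depend on k")
    return Fraction(-f0, slope)


u, v, w = (1, -2, 3), (4, 5, -1), (2, 7, 4)
print(dot(u, v), dot(u, w), dot(v, w))
k = solve_orthogonal_k(lambda k: (1, 2, 3, 4), lambda k: (6, k, -8, 2))
print(k)
```

Exact integer or `Fraction` arithmetic is used so that the test `dot(u, v) == 0` is reliable; with floating-point components a tolerance would be needed instead.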
Basic properties of the dot product in $R^n$ (proved in Problem 1.13) follow.
THEOREM 1.2: For any vectors u, v, w in $R^n$ and any scalar k in R:
(i) $(u + v) \cdot w = u \cdot w + v \cdot w$,    (iii) $u \cdot v = v \cdot u$,
(ii) $(ku) \cdot v = k(u \cdot v)$,    (iv) $u \cdot u \ge 0$, and $u \cdot u = 0$ iff u = 0.


Note that (ii) says that we can “take k out” from the first position in an inner product. By (iii) and (ii),
$u \cdot (kv) = (kv) \cdot u = k(v \cdot u) = k(u \cdot v)$
That is, we can also “take k out” from the second position in an inner product.
The space $R^n$ with the above operations of vector addition, scalar multiplication, and dot product is
usually called Euclidean n-space. 


Norm (Length) of a Vector 


The norm or length of a vector u in $R^n$, denoted by $\|u\|$, is defined to be the nonnegative square root of
$u \cdot u$. In particular, if $u = (a_1, a_2, \ldots, a_n)$, then


$$\|u\| = \sqrt{u \cdot u} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$$
That is, $\|u\|$ is the square root of the sum of the squares of the components of u. Thus, $\|u\| \ge 0$, and
$\|u\| = 0$ if and only if u = 0.
A vector u is called a unit vector if $\|u\| = 1$ or, equivalently, if $u \cdot u = 1$. For any nonzero vector v in
$R^n$, the vector


$$\hat{v} = \frac{1}{\|v\|} v = \frac{v}{\|v\|}$$


is the unique unit vector in the same direction as v. The process of finding $\hat{v}$ from v is called normalizing v.


EXAMPLE 1.4 


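(a) Suppose u = (1, -2, -4, 5, 3). To find $\|u\|$, we can first find $\|u\|^2 = u \cdot u$ by squaring each component of u and
adding, as follows:


$\|u\|^2 = 1^2 + (-2)^2 + (-4)^2 + 5^2 + 3^2 = 1 + 4 + 16 + 25 + 9 = 55$
Then $\|u\| = \sqrt{55}$.
(b) Let v = (1, -3, 4, 2) and $w = (\tfrac{1}{2}, -\tfrac{1}{6}, \tfrac{5}{6}, \tfrac{1}{6})$. Then


$$\|v\| = \sqrt{1 + 9 + 16 + 4} = \sqrt{30} \quad\text{and}\quad \|w\| = \sqrt{\tfrac{9}{36} + \tfrac{1}{36} + \tfrac{25}{36} + \tfrac{1}{36}} = \sqrt{\tfrac{36}{36}} = \sqrt{1} = 1$$


Thus w is a unit vector, but v is not a unit vector. However, we can normalize v as follows:


$$\hat{v} = \frac{v}{\|v\|} = \left( \frac{1}{\sqrt{30}}, \frac{-3}{\sqrt{30}}, \frac{4}{\sqrt{30}}, \frac{2}{\sqrt{30}} \right)$$
This is the unique unit vector in the same direction as v.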
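As a rough numerical companion to the norm and to normalization, here is a short Python sketch. It assumes floating-point arithmetic, so results are compared with a tolerance rather than exactly; the names `norm` and `normalize` are illustrative.

```python
import math


def norm(u):
    """Return ||u||, the nonnegative square root of u . u."""
    return math.sqrt(sum(a * a for a in u))


def normalize(v):
    """Return the unit vector in the direction of a nonzero vector v."""
    length = norm(v)
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return tuple(a / length for a in v)


u = (1, -2, -4, 5, 3)
w = (1 / 2, -1 / 6, 5 / 6, 1 / 6)  # squared components sum to 36/36
v_hat = normalize((1, -3, 4, 2))

print(norm(u) ** 2)  # close to 55, up to rounding
print(norm(w))       # close to 1, so w is a unit vector
print(norm(v_hat))   # close to 1 after normalizing
```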


The following formula (proved in Problem 1.14) is known as the Schwarz inequality or Cauchy— 
Schwarz inequality. It is used in many branches of mathematics. 


THEOREM 1.3 (Schwarz): For any vectors u, v in $R^n$, $|u \cdot v| \le \|u\| \|v\|$.


Using the above inequality, we also prove (Problem 1.15) the following result known as the “triangle 
inequality” or Minkowski’s inequality. 


THEOREM 1.4 (Minkowski): For any vectors u, v in $R^n$, $\|u + v\| \le \|u\| + \|v\|$.


Distance, Angles, Projections 


The distance between vectors $u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$ in $R^n$ is denoted and defined
by


$$d(u, v) = \|u - v\| = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2}$$


One can show that this definition agrees with the usual notion of distance in the Euclidean plane $R^2$ or
space $R^3$.
The angle $\theta$ between nonzero vectors u, v in $R^n$ is defined by


$$\cos\theta = \frac{u \cdot v}{\|u\| \|v\|}$$


The angle $\theta$ is well defined, because, by the Schwarz inequality (Theorem 1.3),
$$-1 \le \frac{u \cdot v}{\|u\| \|v\|} \le 1$$
Note that if $u \cdot v = 0$, then $\theta = 90°$ (or $\theta = \pi/2$). This then agrees with our previous definition of
orthogonality.


The projection of a vector u onto a nonzero vector v is the vector denoted and defined by
$$\mathrm{proj}(u, v) = \frac{u \cdot v}{\|v\|^2} v = \frac{u \cdot v}{v \cdot v} v$$
We show below that this agrees with the usual notion of vector projection in physics.
EXAMPLE 1.5 
(a) Suppose u = (1, -2, 3) and v = (2, 4, 5). Then
$$d(u, v) = \sqrt{(1 - 2)^2 + (-2 - 4)^2 + (3 - 5)^2} = \sqrt{1 + 36 + 4} = \sqrt{41}$$


To find $\cos\theta$, where $\theta$ is the angle between u and v, we first find


$u \cdot v = 2 - 8 + 15 = 9$, $\|u\|^2 = 1 + 4 + 9 = 14$, $\|v\|^2 = 4 + 16 + 25 = 45$
Then
$$\cos\theta = \frac{u \cdot v}{\|u\| \|v\|} = \frac{9}{\sqrt{14}\sqrt{45}}$$
Also,
$$\mathrm{proj}(u, v) = \frac{u \cdot v}{\|v\|^2} v = \frac{9}{45}(2, 4, 5) = \frac{1}{5}(2, 4, 5) = \left( \frac{2}{5}, \frac{4}{5}, 1 \right)$$


(b) Consider the vectors u and v in Fig. 1-2(a) (with respective endpoints A and B). The (perpendicular) projection
of u onto v is the vector u* with magnitude
$$\|u^*\| = \|u\| \cos\theta = \|u\| \frac{u \cdot v}{\|u\| \|v\|} = \frac{u \cdot v}{\|v\|}$$
To obtain u*, we multiply its magnitude by the unit vector in the direction of v, obtaining
$$u^* = \|u^*\| \frac{v}{\|v\|} = \frac{u \cdot v}{\|v\|} \frac{v}{\|v\|} = \frac{u \cdot v}{\|v\|^2} v$$


This is the same as the above definition of proj(u, v).
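Distance, angle, and projection can be checked with a small Python sketch. It assumes integer (or rational) components so that the projection coefficient can be kept exact with `fractions.Fraction`; the function names are illustrative rather than taken from the text.

```python
import math
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def distance(u, v):
    """d(u, v) = ||u - v||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cos_angle(u, v):
    """cos(theta) = (u . v) / (||u|| ||v||) for nonzero u and v."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def proj(u, v):
    """Projection of u onto a nonzero v, with an exact rational coefficient."""
    vv = dot(v, v)
    if vv == 0:
        raise ValueError("cannot project onto the zero vector")
    c = Fraction(dot(u, v), vv)
    return tuple(c * b for b in v)


u, v = (1, -2, 3), (2, 4, 5)
print(distance(u, v))   # sqrt(41)
print(cos_angle(u, v))  # 9 / (sqrt(14) * sqrt(45))
print(proj(u, v))       # coefficient 9/45 = 1/5 applied to v
```

Keeping the coefficient as a fraction avoids rounding in the projection, while the distance and cosine necessarily involve square roots and are returned as floats.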


[Figure 1-2: (a) Projection u* of u onto v; (b) u = B - A]
1.5 Located Vectors, Hyperplanes, Lines, Curves in $R^n$


This section distinguishes between an n-tuple $P(a_i) = P(a_1, a_2, \ldots, a_n)$ viewed as a point in $R^n$ and an
n-tuple $u = [c_1, c_2, \ldots, c_n]$ viewed as a vector (arrow) from the origin O to the point $C(c_1, c_2, \ldots, c_n)$.


Located Vectors
Any pair of points $A(a_i)$ and $B(b_i)$ in $R^n$ defines the located vector or directed line segment from A to B,
written $\overrightarrow{AB}$. We identify $\overrightarrow{AB}$ with the vector
$u = B - A = [b_1 - a_1, b_2 - a_2, \ldots, b_n - a_n]$


because $\overrightarrow{AB}$ and u have the same magnitude and direction. This is pictured in Fig. 1-2(b) for the
points $A(a_1, a_2, a_3)$ and $B(b_1, b_2, b_3)$ in $R^3$ and the vector u = B - A which has the endpoint
$P(b_1 - a_1, b_2 - a_2, b_3 - a_3)$.


Hyperplanes 
A hyperplane H in $R^n$ is the set of points $(x_1, x_2, \ldots, x_n)$ that satisfy a linear equation
$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$
where the vector $u = [a_1, a_2, \ldots, a_n]$ of coefficients is not zero. Thus a hyperplane H in $R^2$ is a line, and a
hyperplane H in $R^3$ is a plane. We show below, as pictured in Fig. 1-3(a) for $R^3$, that u is orthogonal to
any directed line segment $\overrightarrow{PQ}$, where $P(p_i)$ and $Q(q_i)$ are points in H. [For this reason, we say that u is
normal to H and that H is normal to u.]


[Figure 1-3: panels (a) and (b)]


Because $P(p_i)$ and $Q(q_i)$ belong to H, they satisfy the above hyperplane equation, that is,
$a_1 p_1 + a_2 p_2 + \cdots + a_n p_n = b$ and $a_1 q_1 + a_2 q_2 + \cdots + a_n q_n = b$


Let $v = \overrightarrow{PQ} = Q - P = [q_1 - p_1, q_2 - p_2, \ldots, q_n - p_n]$


Then


$$u \cdot v = a_1(q_1 - p_1) + a_2(q_2 - p_2) + \cdots + a_n(q_n - p_n)$$
$$= (a_1 q_1 + a_2 q_2 + \cdots + a_n q_n) - (a_1 p_1 + a_2 p_2 + \cdots + a_n p_n) = b - b = 0$$


Thus $v = \overrightarrow{PQ}$ is orthogonal to u, as claimed.
Lines in $R^n$


The line L in $R^n$ passing through the point $P(b_1, b_2, \ldots, b_n)$ and in the direction of a nonzero vector
$u = [a_1, a_2, \ldots, a_n]$ consists of the points $X(x_1, x_2, \ldots, x_n)$ that satisfy


$$X = P + tu \quad\text{or}\quad \begin{cases} x_1 = a_1 t + b_1 \\ x_2 = a_2 t + b_2 \\ \quad\vdots \\ x_n = a_n t + b_n \end{cases} \quad\text{or}\quad L(t) = (a_i t + b_i)$$


where the parameter t takes on all real values. Such a line L in $R^3$ is pictured in Fig. 1-3(b).


EXAMPLE 1.6 


(a) Let H be the plane in $R^3$ corresponding to the linear equation 2x - 5y + 7z = 4. Observe that P(1, 1, 1) and
Q(5, 4, 2) are solutions of the equation. Thus P and Q and the directed line segment


$v = \overrightarrow{PQ} = Q - P = [5 - 1, 4 - 1, 2 - 1] = [4, 3, 1]$

lie on the plane H. The vector u = [2, -5, 7] is normal to H, and, as expected,
$u \cdot v = [2, -5, 7] \cdot [4, 3, 1] = 8 - 15 + 7 = 0$

That is, u is orthogonal to v.


(b) Find an equation of the hyperplane H in $R^4$ that passes through the point P(1, 3, -4, 2) and is normal to the
vector u = [4, -2, 5, 6].
The coefficients of the unknowns of an equation of H are the components of the normal vector u; hence, the
equation of H must be of the form


$4x_1 - 2x_2 + 5x_3 + 6x_4 = k$
Substituting P into this equation, we obtain


$4(1) - 2(3) + 5(-4) + 6(2) = k$ or $4 - 6 - 20 + 12 = k$ or $k = -10$


Thus, $4x_1 - 2x_2 + 5x_3 + 6x_4 = -10$ is the equation of H.


(c) Find the parametric representation of the line L in $R^4$ passing through the point P(1, 2, 3, -4) and in the
direction of u = [5, 6, -7, 8]. Also, find the point Q on L when t = 1.
Substitution in the above equation for L yields the following parametric representation:


$x_1 = 5t + 1$, $x_2 = 6t + 2$, $x_3 = -7t + 3$, $x_4 = 8t - 4$


or, equivalently,


$L(t) = (5t + 1, 6t + 2, -7t + 3, 8t - 4)$


Note that t = 0 yields the point P on L. Substitution of t = 1 yields the point Q(6, 8, -4, 4) on L.
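The two constructions in this example, a hyperplane from a point and a normal vector, and a point on a parametrized line, can be sketched as follows. The helper names `hyperplane_through` and `point_on_line` are hypothetical, and a hyperplane is represented, by assumption, as a pair (normal, b) standing for the equation normal . x = b.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hyperplane_through(point, normal):
    """Return (normal, b) describing normal . x = b through the given point."""
    if all(a == 0 for a in normal):
        raise ValueError("normal vector must be nonzero")
    return tuple(normal), dot(normal, point)


def point_on_line(p, u, t):
    """Return L(t) = P + t*u, that is, x_i = a_i*t + b_i."""
    return tuple(b + t * a for a, b in zip(u, p))


# Part (b): hyperplane in R^4 through P with normal u.
coeffs, b = hyperplane_through((1, 3, -4, 2), (4, -2, 5, 6))
print(coeffs, b)

# Part (c): points on the line through P in the direction of u.
P, u = (1, 2, 3, -4), (5, 6, -7, 8)
print(point_on_line(P, u, 0))  # t = 0 gives P itself
print(point_on_line(P, u, 1))  # t = 1 gives Q
```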


Curves in $R^n$


Let D be an interval (finite or infinite) on the real line R. A continuous function $F: D \to R^n$ is a curve in
$R^n$. Thus, to each point $t \in D$ there is assigned the following point in $R^n$:


$F(t) = [F_1(t), F_2(t), \ldots, F_n(t)]$
Moreover, the derivative (if it exists) of F(t) yields the vector


$$V(t) = \frac{dF(t)}{dt} = \left[ \frac{dF_1(t)}{dt}, \frac{dF_2(t)}{dt}, \ldots, \frac{dF_n(t)}{dt} \right]$$
which is tangent to the curve. Normalizing V(t) yields
$$T(t) = \frac{V(t)}{\|V(t)\|}$$


Thus, T(t) is the unit tangent vector to the curve. (Unit vectors with geometrical significance are often
presented in bold type.)


EXAMPLE 1.7 Consider the curve F(t) = [sin t, cos t, t] in $R^3$. Taking the derivative of F(t) [or each component of
F(t)] yields


V(t) = [cos t, -sin t, 1]


which is a vector tangent to the curve. We normalize V(t). First we obtain


$\|V(t)\|^2 = \cos^2 t + \sin^2 t + 1 = 1 + 1 = 2$
Then the unit tangent vector T(t) to the curve follows:


$$T(t) = \frac{V(t)}{\|V(t)\|} = \left[ \frac{\cos t}{\sqrt{2}}, \frac{-\sin t}{\sqrt{2}}, \frac{1}{\sqrt{2}} \right]$$
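The unit tangent vector of this curve can be sanity-checked numerically. The sketch below is not the analytic method of the example: it swaps in a central-difference approximation of the derivative (step size `h` is an arbitrary assumption) and compares the normalized result with the closed form.

```python
import math


def F(t):
    """The curve F(t) = [sin t, cos t, t] in R^3."""
    return (math.sin(t), math.cos(t), t)


def tangent(curve, t, h=1e-6):
    """Approximate V(t) = dF/dt componentwise by a central difference."""
    ahead, behind = curve(t + h), curve(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))


def unit_tangent(curve, t, h=1e-6):
    """Approximate T(t) = V(t) / ||V(t)||."""
    V = tangent(curve, t, h)
    length = math.sqrt(sum(c * c for c in V))
    return tuple(c / length for c in V)


t = 0.7
T = unit_tangent(F, t)
exact = (math.cos(t) / math.sqrt(2), -math.sin(t) / math.sqrt(2), 1 / math.sqrt(2))
print(T)
print(exact)
```

The two printed vectors should agree to several decimal places; the small discrepancy comes from the finite-difference step, not from the formula.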


1.6 Vectors in $R^3$ (Spatial Vectors), ijk Notation


Vectors in $R^3$, called spatial vectors, appear in many applications, especially in physics. In fact, a special
notation is frequently used for such vectors as follows:


i = [1, 0, 0] denotes the unit vector in the x direction.
j = [0, 1, 0] denotes the unit vector in the y direction.
k = [0, 0, 1] denotes the unit vector in the z direction.
Then any vector u = [a, b, c] in $R^3$ can be expressed uniquely in the form
u = [a, b, c] = ai + bj + ck


Because the vectors i, j, k are unit vectors and are mutually orthogonal, we obtain the following dot
products:


$i \cdot i = 1$, $j \cdot j = 1$, $k \cdot k = 1$ and $i \cdot j = 0$, $i \cdot k = 0$, $j \cdot k = 0$


Furthermore, the vector operations discussed above may be expressed in the ijk notation as follows.
Suppose

$u = a_1 i + a_2 j + a_3 k$ and $v = b_1 i + b_2 j + b_3 k$
Then


$u + v = (a_1 + b_1)i + (a_2 + b_2)j + (a_3 + b_3)k$ and $cu = ca_1 i + ca_2 j + ca_3 k$


where c is a scalar. Also,


$u \cdot v = a_1 b_1 + a_2 b_2 + a_3 b_3$ and $\|u\| = \sqrt{u \cdot u} = \sqrt{a_1^2 + a_2^2 + a_3^2}$


EXAMPLE 1.8 Suppose u = 3i + 5j - 2k and v = 4i - 8j + 7k.
(a) To find u + v, add corresponding components, obtaining u + v = 7i - 3j + 5k
(b) To find 3u - 2v, first multiply by the scalars and then add:

3u - 2v = (9i + 15j - 6k) + (-8i + 16j - 14k) = i + 31j - 20k
(c) To find $u \cdot v$, multiply corresponding components and then add:
$u \cdot v = 12 - 40 - 14 = -42$
(d) To find $\|u\|$, take the square root of the sum of the squares of the components:


$\|u\| = \sqrt{9 + 25 + 4} = \sqrt{38}$


Cross Product 


There is a special operation for vectors u and v in $R^3$ that is not defined in $R^n$ for $n \ne 3$. This operation is
called the cross product and is denoted by $u \times v$. One way to easily remember the formula for $u \times v$ is to
use the determinant (of order two) and its negative, which are denoted and defined as follows:


$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \quad\text{and}\quad -\begin{vmatrix} a & b \\ c & d \end{vmatrix} = bc - ad$$


Here a and d are called the diagonal elements and b and c are the nondiagonal elements. Thus, the
determinant is the product ad of the diagonal elements minus the product bc of the nondiagonal elements,
but vice versa for the negative of the determinant.

Now suppose $u = a_1 i + a_2 j + a_3 k$ and $v = b_1 i + b_2 j + b_3 k$. Then


$$u \times v = (a_2 b_3 - a_3 b_2)i + (a_3 b_1 - a_1 b_3)j + (a_1 b_2 - a_2 b_1)k$$
$$= \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} i - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} j + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} k$$
That is, the three components of $u \times v$ are obtained from the array
$$\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}$$


(which contains the components of u above the components of v) as follows:


(1) Cover the first column and take the determinant.
(2) Cover the second column and take the negative of the determinant.
(3) Cover the third column and take the determinant.


Note that u x v is a vector; hence, u x v is also called the vector product or outer product of u 
and v. 
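EXAMPLE 1.9 Find $u \times v$ where: (a) u = 4i + 3j + 6k, v = 2i + 5j - 3k, (b) u = [2, -1, 5], v = [3, 7, 6].


(a) Use $\begin{bmatrix} 4 & 3 & 6 \\ 2 & 5 & -3 \end{bmatrix}$ to get $u \times v = (-9 - 30)i + (12 + 12)j + (20 - 6)k = -39i + 24j + 14k$


(b) Use $\begin{bmatrix} 2 & -1 & 5 \\ 3 & 7 & 6 \end{bmatrix}$ to get $u \times v = [-6 - 35, 15 - 12, 14 + 3] = [-41, 3, 17]$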
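The "cover a column" rule for the cross product can be written out almost verbatim in Python. This is a minimal sketch, assuming vectors in R^3 are given as 3-tuples; `det2` and `cross` are illustrative names.

```python
def det2(a, b, c, d):
    """Determinant of order two with rows (a, b) and (c, d): ad - bc."""
    return a * d - b * c


def cross(u, v):
    """Cross product of u and v in R^3 via the cover-a-column rule."""
    a1, a2, a3 = u
    b1, b2, b3 = v
    return (
        det2(a2, a3, b2, b3),   # cover the first column
        -det2(a1, a3, b1, b3),  # cover the second column, take the negative
        det2(a1, a2, b1, b2),   # cover the third column
    )


print(cross((4, 3, 6), (2, 5, -3)))   # part (a)
print(cross((2, -1, 5), (3, 7, 6)))   # part (b)
print(cross((1, 0, 0), (0, 1, 0)))    # i x j
```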


Remark: The cross products of the vectors i, j, k are as follows:
$i \times j = k$, $j \times k = i$, $k \times i = j$
$j \times i = -k$, $k \times j = -i$, $i \times k = -j$


Thus, if we view the triple (i, j, k) as a cyclic permutation, where i follows k and hence k precedes i, then 
the product of two of them in the given direction is the third one, but the product of two of them in the 
opposite direction is the negative of the third one. 


Two important properties of the cross product are contained in the following theorem. 


[Figure 1-4: (a) Volume = u · v × w; (b) Complex plane]


THEOREM 1.5: Let u, v, w be vectors in $R^3$.
(a) The vector $u \times v$ is orthogonal to both u and v.


(b) The absolute value of the “triple product”
$u \cdot v \times w$


represents the volume of the parallelepiped formed by the vectors u, v, w.
[See Fig. 1-4(a).]


We note that the vectors u, v, $u \times v$ form a right-handed system, and that the following formula
gives the magnitude of $u \times v$:
$\|u \times v\| = \|u\| \|v\| \sin\theta$


where $\theta$ is the angle between u and v.
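Both parts of Theorem 1.5 and the magnitude formula can be spot-checked numerically. The sketch below is only a check on sample vectors, not a proof; the vectors and the helper name `volume` are chosen for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    a1, a2, a3 = u
    b1, b2, b3 = v
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


def volume(u, v, w):
    """Volume of the parallelepiped on u, v, w: |u . (v x w)|."""
    return abs(dot(u, cross(v, w)))


u, v = (4, 3, 6), (2, 5, -3)
n = cross(u, v)
print(dot(n, u), dot(n, v))  # both zero: u x v is orthogonal to u and to v

# Compare ||u x v|| with ||u|| ||v|| sin(theta).
norm_u, norm_v = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
theta = math.acos(dot(u, v) / (norm_u * norm_v))
lhs = math.sqrt(dot(n, n))
rhs = norm_u * norm_v * math.sin(theta)
print(lhs, rhs)

print(volume((1, 0, 0), (0, 2, 0), (0, 0, 3)))  # a 1 x 2 x 3 box
```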


1.7 Complex Numbers 


The set of complex numbers is denoted by C. Formally, a complex number is an ordered pair (a,b) of 
real numbers where equality, addition, and multiplication are defined as follows: 


(a, b) = (c, d) if and only if a = c and b = d
(a, b) + (c, d) = (a + c, b + d)
$(a, b) \cdot (c, d) = (ac - bd, ad + bc)$
We identify the real number a with the complex number (a, 0); that is,
$a \leftrightarrow (a, 0)$


This is possible because the operations of addition and multiplication of real numbers are preserved under 
the correspondence; that is, 


(a, 0) + (b, 0) = (a + b, 0) and $(a, 0) \cdot (b, 0) = (ab, 0)$


Thus we view R as a subset of C, and replace (a,0) by a whenever convenient and possible. 
We note that the set C of complex numbers with the above operations of addition and multiplication is 
a field of numbers, like the set R of real numbers and the set Q of rational numbers. 
The complex number (0, 1) is denoted by i. It has the important property that
$i^2 = ii = (0, 1)(0, 1) = (-1, 0) = -1$ or $i = \sqrt{-1}$
Accordingly, any complex number z = (a, b) can be written in the form
$z = (a, b) = (a, 0) + (0, b) = (a, 0) + (b, 0) \cdot (0, 1) = a + bi$


The above notation z = a + bi, where a = Re z and b = Im z are called, respectively, the real and
imaginary parts of z, is more convenient than (a, b). In fact, the sum and product of complex numbers
z = a + bi and w = c + di can be derived by simply using the commutative and distributive laws and
$i^2 = -1$:


$z + w = (a + bi) + (c + di) = a + c + bi + di = (a + c) + (b + d)i$
$zw = (a + bi)(c + di) = ac + bci + adi + bdi^2 = (ac - bd) + (bc + ad)i$


We also define the negative of z and subtraction in C by


$-z = -1z$ and $w - z = w + (-z)$


Warning: The letter i representing $\sqrt{-1}$ has no relationship whatsoever to the vector i = [1, 0, 0] in
Section 1.6.


Complex Conjugate, Absolute Value 


Consider a complex number z = a + bi. The conjugate of z is denoted and defined by
$\bar{z} = \overline{a + bi} = a - bi$


Then $z\bar{z} = (a + bi)(a - bi) = a^2 - b^2 i^2 = a^2 + b^2$. Note that z is real if and only if $\bar{z} = z$.
The absolute value of z, denoted by |z|, is defined to be the nonnegative square root of $z\bar{z}$. Namely,


$|z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}$


Note that |z| is equal to the norm of the vector (a, b) in $R^2$.
Suppose $z \ne 0$. Then the inverse $z^{-1}$ of z and division in C of w by z are given, respectively, by


$$z^{-1} = \frac{\bar{z}}{z\bar{z}} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2} i \quad\text{and}\quad \frac{w}{z} = \frac{w\bar{z}}{z\bar{z}} = wz^{-1}$$

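EXAMPLE 1.10 Suppose z = 2 + 3i and w = 5 - 2i. Then


$z + w = (2 + 3i) + (5 - 2i) = 2 + 5 + 3i - 2i = 7 + i$


$\bar{z} = \overline{2 + 3i} = 2 - 3i$ and $\bar{w} = \overline{5 - 2i} = 5 + 2i$
$$\frac{w}{z} = \frac{5 - 2i}{2 + 3i} = \frac{(5 - 2i)(2 - 3i)}{(2 + 3i)(2 - 3i)} = \frac{4 - 19i}{13} = \frac{4}{13} - \frac{19}{13} i$$
$|z| = \sqrt{4 + 9} = \sqrt{13}$ and $|w| = \sqrt{25 + 4} = \sqrt{29}$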
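The ordered-pair definition of complex numbers can be implemented directly, which makes the rules for multiplication and division concrete. The sketch below assumes pairs of integers and uses `fractions.Fraction` for exact quotients; the `c_`-prefixed names are hypothetical. Python's built-in `complex` type is used only as a cross-check.

```python
from fractions import Fraction


def c_add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)."""
    return (z[0] + w[0], z[1] + w[1])


def c_mul(z, w):
    """(a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def c_conj(z):
    """Conjugate of (a, b) is (a, -b)."""
    return (z[0], -z[1])


def c_div(w, z):
    """w / z = w * conj(z) / (z * conj(z)) for nonzero z, kept exact."""
    a, b = z
    denom = a * a + b * b
    if denom == 0:
        raise ZeroDivisionError("division by the complex number 0")
    num = c_mul(w, c_conj(z))
    return (Fraction(num[0], denom), Fraction(num[1], denom))


z, w = (2, 3), (5, -2)
print(c_add(z, w))            # z + w
print(c_mul(z, w))            # zw
print(c_div(w, z))            # w / z as exact fractions
print(c_mul((0, 1), (0, 1)))  # i * i
print(complex(2, 3) * complex(5, -2))  # cross-check with the built-in type
```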


Complex Plane 


Recall that the real numbers R can be represented by points on a line. Analogously, the complex numbers 
C can be represented by points in the plane. Specifically, we let the point (a, b) in the plane represent the 
complex number a + bi as shown in Fig. 1-4(b). In such a case, |z| is the distance from the origin O to the 
point z. The plane with this representation is called the complex plane, just like the line representing R is 
called the real line. 
1.8 Vectors in $C^n$


The set of all n-tuples of complex numbers, denoted by $C^n$, is called complex n-space. Just as in the real
case, the elements of $C^n$ are called points or vectors, the elements of C are called scalars, and vector
addition in $C^n$ and scalar multiplication on $C^n$ are given by


$[z_1, z_2, \ldots, z_n] + [w_1, w_2, \ldots, w_n] = [z_1 + w_1, z_2 + w_2, \ldots, z_n + w_n]$


$z[z_1, z_2, \ldots, z_n] = [zz_1, zz_2, \ldots, zz_n]$
where the $z_i$, $w_i$, and z belong to C.
EXAMPLE 1.11 Consider vectors u = [2 + 3i, 4 - i, 3] and v = [3 - 2i, 5i, 4 - 6i] in $C^3$. Then


$u + v = [2 + 3i, 4 - i, 3] + [3 - 2i, 5i, 4 - 6i] = [5 + i, 4 + 4i, 7 - 6i]$
$(5 - 2i)u = [(5 - 2i)(2 + 3i), (5 - 2i)(4 - i), (5 - 2i)(3)] = [16 + 11i, 18 - 13i, 15 - 6i]$


Dot (Inner) Product in $C^n$


Consider vectors $u = [z_1, z_2, \ldots, z_n]$ and $v = [w_1, w_2, \ldots, w_n]$ in $C^n$. The dot or inner product of u and v is
denoted and defined by
$u \cdot v = z_1 \bar{w}_1 + z_2 \bar{w}_2 + \cdots + z_n \bar{w}_n$


This definition reduces to the real case because $\bar{w}_i = w_i$ when $w_i$ is real. The norm of u is defined by


$$\|u\| = \sqrt{u \cdot u} = \sqrt{z_1 \bar{z}_1 + z_2 \bar{z}_2 + \cdots + z_n \bar{z}_n} = \sqrt{|z_1|^2 + |z_2|^2 + \cdots + |z_n|^2}$$
We emphasize that $u \cdot u$ and so $\|u\|$ are real and positive when $u \ne 0$ and 0 when u = 0.


EXAMPLE 1.12 Consider vectors u = [2 + 3i, 4 - i, 3 + 5i] and v = [3 - 4i, 5i, 4 - 2i] in $C^3$. Then


$u \cdot v = (2 + 3i)\overline{(3 - 4i)} + (4 - i)\overline{(5i)} + (3 + 5i)\overline{(4 - 2i)}$
$= (2 + 3i)(3 + 4i) + (4 - i)(-5i) + (3 + 5i)(4 + 2i)$


$= (-6 + 17i) + (-5 - 20i) + (2 + 26i) = -9 + 23i$
$u \cdot u = |2 + 3i|^2 + |4 - i|^2 + |3 + 5i|^2 = 4 + 9 + 16 + 1 + 9 + 25 = 64$
$\|u\| = \sqrt{64} = 8$
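The complex dot product differs from the real one only in the conjugate on the second factor, which is easy to get wrong by hand. A minimal Python sketch using the built-in `complex` type (where `j` plays the role of i) lets the terms be checked one at a time; `cdot` and `cnorm` are illustrative names.

```python
import math


def cdot(u, v):
    """Dot product in C^n: sum of z_k times the conjugate of w_k."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(z * w.conjugate() for z, w in zip(u, v))


def cnorm(u):
    """Norm in C^n; u . u is real and nonnegative, so take its real part."""
    return math.sqrt(cdot(u, u).real)


u = [2 + 3j, 4 - 1j, 3 + 5j]
v = [3 - 4j, 5j, 4 - 2j]

# Print each term z_k * conj(w_k), then the total and the norm of u.
for z, w in zip(u, v):
    print(z * w.conjugate())
print(cdot(u, v))
print(cnorm(u))
```

Note that swapping the arguments conjugates the result, so `cdot(v, u)` equals the conjugate of `cdot(u, v)`.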


The space $C^n$ with the above operations of vector addition, scalar multiplication, and dot product, is
called complex Euclidean n-space. Theorem 1.2 for $R^n$ also holds for $C^n$ if we replace $u \cdot v = v \cdot u$ by


$u \cdot v = \overline{v \cdot u}$


On the other hand, the Schwarz inequality (Theorem 1.3) and Minkowski’s inequality (Theorem 1.4) are
true for $C^n$ with no changes.


SOLVED PROBLEMS 


Vectors in $R^n$
1.1. Determine which of the following vectors are equal:


$u_1 = (1, 2, 3)$, $u_2 = (2, 3, 1)$, $u_3 = (1, 3, 2)$, $u_4 = (2, 3, 1)$


Vectors are equal only when corresponding entries are equal; hence, only $u_2 = u_4$.
1.2. Let u = (2, -7, 1), v = (-3, 0, 4), w = (0, 5, -8). Find:

(a) 3u - 4v,

(b) 2u + 3v - 5w.

First perform the scalar multiplication and then the vector addition.

(a) 3u - 4v = 3(2, -7, 1) - 4(-3, 0, 4) = (6, -21, 3) + (12, 0, -16) = (18, -21, -13)
(b) 2u + 3v - 5w = (4, -14, 2) + (-9, 0, 12) + (0, -25, 40) = (-5, -39, 54)


1.3. Let $u = \begin{bmatrix} 5 \\ 3 \\ -4 \end{bmatrix}$, $v = \begin{bmatrix} -1 \\ 5 \\ 2 \end{bmatrix}$, $w = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$. Find:

(a) 5u - 2v,

(b) -2u + 4v - 3w.

First perform the scalar multiplication and then the vector addition:

(a) $$5u - 2v = 5\begin{bmatrix} 5 \\ 3 \\ -4 \end{bmatrix} - 2\begin{bmatrix} -1 \\ 5 \\ 2 \end{bmatrix} = \begin{bmatrix} 25 \\ 15 \\ -20 \end{bmatrix} + \begin{bmatrix} 2 \\ -10 \\ -4 \end{bmatrix} = \begin{bmatrix} 27 \\ 5 \\ -24 \end{bmatrix}$$

(b) $$-2u + 4v - 3w = \begin{bmatrix} -10 \\ -6 \\ 8 \end{bmatrix} + \begin{bmatrix} -4 \\ 20 \\ 8 \end{bmatrix} + \begin{bmatrix} -9 \\ 3 \\ 6 \end{bmatrix} = \begin{bmatrix} -23 \\ 17 \\ 22 \end{bmatrix}$$


1.4. Find x and y, where: (a) (x, 3) = (2, x + y), (b) (4, y) = x(2, 3).

(a) Because the vectors are equal, set the corresponding entries equal to each other, yielding
x = 2, 3 = x + y

Solve the linear equations, obtaining x = 2, y = 1.

(b) First multiply by the scalar x to obtain (4, y) = (2x, 3x). Then set corresponding entries equal to each
other to obtain
4 = 2x, y = 3x

Solve the equations to yield x = 2, y = 6.


1.5. Write the vector v = (1, -2, 5) as a linear combination of the vectors $u_1 = (1, 1, 1)$, $u_2 = (1, 2, 3)$,
$u_3 = (2, -1, 1)$.

We want to express v in the form $v = xu_1 + yu_2 + zu_3$ with x, y, z as yet unknown. First we have

$$\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix} = x\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + z\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} x + y + 2z \\ x + 2y - z \\ x + 3y + z \end{bmatrix}$$

(It is more convenient to write vectors as columns than as rows when forming linear combinations.) Set
corresponding entries equal to each other to obtain

$$\begin{aligned} x + y + 2z &= 1 \\ x + 2y - z &= -2 \\ x + 3y + z &= 5 \end{aligned} \quad\text{or}\quad \begin{aligned} x + y + 2z &= 1 \\ y - 3z &= -3 \\ 2y - z &= 4 \end{aligned} \quad\text{or}\quad \begin{aligned} x + y + 2z &= 1 \\ y - 3z &= -3 \\ 5z &= 10 \end{aligned}$$

The unique solution of the triangular system is x = -6, y = 3, z = 2. Thus, $v = -6u_1 + 3u_2 + 2u_3$.
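1.6. Write v = (2, -5, 3) as a linear combination of

$u_1 = (1, -3, 2)$, $u_2 = (2, -4, -1)$, $u_3 = (1, -5, 7)$.

Find the equivalent system of linear equations and then solve. First,

$$\begin{bmatrix} 2 \\ -5 \\ 3 \end{bmatrix} = x\begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ -4 \\ -1 \end{bmatrix} + z\begin{bmatrix} 1 \\ -5 \\ 7 \end{bmatrix} = \begin{bmatrix} x + 2y + z \\ -3x - 4y - 5z \\ 2x - y + 7z \end{bmatrix}$$
Set the corresponding entries equal to each other to obtain
$$\begin{aligned} x + 2y + z &= 2 \\ -3x - 4y - 5z &= -5 \\ 2x - y + 7z &= 3 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y + z &= 2 \\ 2y - 2z &= 1 \\ -5y + 5z &= -1 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y + z &= 2 \\ 2y - 2z &= 1 \\ 0 &= 3 \end{aligned}$$

The third equation, 0x + 0y + 0z = 3, indicates that the system has no solution. Thus, v cannot be written as
a linear combination of the vectors $u_1, u_2, u_3$.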
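The elimination carried out by hand in Problems 1.5 and 1.6 can be automated. The following is a hedged sketch, not the book's procedure verbatim: it applies Gauss-Jordan reduction with exact `Fraction` arithmetic to the augmented matrix whose columns are the given vectors, and reports `None` when a row 0 = nonzero appears. The function name `combination_coefficients` is hypothetical, and any free variables are arbitrarily set to 0.

```python
from fractions import Fraction


def combination_coefficients(vectors, target):
    """Solve x_1*u_1 + ... + x_m*u_m = target; return the x_j or None.

    Returns None when the system is inconsistent. Free variables, if any,
    are set to 0, so the returned solution need not be unique.
    """
    n, m = len(target), len(vectors)
    # Augmented matrix: column j holds u_j, the last column holds the target.
    rows = [[Fraction(vectors[j][i]) for j in range(m)] + [Fraction(target[i])]
            for i in range(n)]
    pivot_cols = []
    r = 0
    for c in range(m):
        if r == n:
            break
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, n) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(n):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivot_cols.append(c)
        r += 1
    # A row reading 0 = nonzero means there is no solution.
    if any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in rows):
        return None
    solution = [Fraction(0)] * m
    for i, c in enumerate(pivot_cols):
        solution[c] = rows[i][-1]
    return solution


print(combination_coefficients([(1, 1, 1), (1, 2, 3), (2, -1, 1)], (1, -2, 5)))
print(combination_coefficients([(1, -3, 2), (2, -4, -1), (1, -5, 7)], (2, -5, 3)))
```

The first call corresponds to Problem 1.5 and the second to Problem 1.6, where the `None` result reflects the inconsistent equation 0 = 3.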


Dot (Inner) Product, Orthogonality, Norm in R^n

1.7. Find u · v where:
(a) u = (2, -5, 6) and v = (8, 2, -3),
(b) u = (4, 2, -3, 5, -1) and v = (2, 6, -1, -4, 8).
Multiply the corresponding components and add:
(a) u · v = 2(8) - 5(2) + 6(-3) = 16 - 10 - 18 = -12
(b) u · v = 8 + 12 + 3 - 20 - 8 = -5


1.8. Let u = (5,4,1), v = (3,—4,1), w = (1, —2,3). Which pair of vectors, if any, are perpendicular 
(orthogonal)? 


Find the dot product of each pair of vectors: 
u · v = 15 - 16 + 1 = 0, v · w = 3 + 8 + 3 = 14, u · w = 5 - 8 + 3 = 0


Thus, u and v are orthogonal, u and w are orthogonal, but v and w are not. 


1.9. Find k so that u and v are orthogonal, where:
(a) u = (1, k, -3) and v = (2, -5, 4),
(b) u = (2, 3k, -4, 1, 5) and v = (6, -1, 3, 7, 2k).
Compute u · v, set u · v equal to 0, and then solve for k:
(a) u · v = 1(2) + k(-5) - 3(4) = -5k - 10. Then -5k - 10 = 0, or k = -2.
(b) u · v = 12 - 3k - 12 + 7 + 10k = 7k + 7. Then 7k + 7 = 0, or k = -1.


1.10. Find ||u||, where: (a) u = (3, -12, -4), (b) u = (2, -3, 8, -7).
First find ||u||^2 = u · u by squaring the entries and adding. Then ||u|| = √(||u||^2).
(a) ||u||^2 = (3)^2 + (-12)^2 + (-4)^2 = 9 + 144 + 16 = 169. Then ||u|| = √169 = 13.
(b) ||u||^2 = 4 + 9 + 64 + 49 = 126. Then ||u|| = √126.
1.11. Recall that normalizing a nonzero vector v means finding the unique unit vector v̂ in the same
direction as v, where

v̂ = (1/||v||) v

Normalize: (a) u = (3, -4), (b) v = (4, -2, -3, 8), (c) w = (1/2, 2/3, -1/4).
(a) First find ||u|| = √(9 + 16) = √25 = 5. Then divide each entry of u by 5, obtaining û = (3/5, -4/5).
(b) Here ||v|| = √(16 + 4 + 9 + 64) = √93. Then

v̂ = (4/√93, -2/√93, -3/√93, 8/√93)

(c) Note that w and any positive multiple of w will have the same normalized form. Hence, first multiply w
by 12 to "clear fractions"; that is, first find w' = 12w = (6, 8, -3). Then

||w'|| = √(36 + 64 + 9) = √109 and ŵ = ŵ' = (6/√109, 8/√109, -3/√109)


1.12. Let u = (1, -3, 4) and v = (3, 4, 7). Find:

(a) cos θ, where θ is the angle between u and v;

(b) proj(u, v), the projection of u onto v;

(c) d(u, v), the distance between u and v.

First find u · v = 3 - 12 + 28 = 19, ||u||^2 = 1 + 9 + 16 = 26, ||v||^2 = 9 + 16 + 49 = 74. Then

(a) cos θ = (u · v)/(||u|| ||v||) = 19/(√26 √74)

(b) proj(u, v) = ((u · v)/||v||^2) v = (19/74)(3, 4, 7) = (57/74, 76/74, 133/74) = (57/74, 38/37, 133/74)

(c) d(u, v) = ||u - v|| = ||(-2, -7, -3)|| = √(4 + 49 + 9) = √62.
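
As a cross-check on the angle, projection, and distance formulas, here is a short sketch using only the Python standard library. The helper names (`dot`, `norm`, `cos_angle`, `proj`, `distance`) are illustrative choices rather than notation from the text, and the projection is kept exact with `Fraction` as a convenience.

```python
import math
from fractions import Fraction

def dot(u, v):
    """Dot product: multiply corresponding components and add."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean norm ||u|| = sqrt(u . u)."""
    return math.sqrt(dot(u, u))

def cos_angle(u, v):
    """Cosine of the angle between nonzero vectors u and v."""
    return dot(u, v) / (norm(u) * norm(v))

def proj(u, v):
    """Projection of u onto nonzero v, (u . v / ||v||^2) v, with exact rationals."""
    c = Fraction(dot(u, v), dot(v, v))
    return tuple(c * b for b in v)

def distance(u, v):
    """Distance d(u, v) = ||u - v||."""
    return norm([a - b for a, b in zip(u, v)])

u, v = (1, -3, 4), (3, 4, 7)
projection = proj(u, v)
```

With integer inputs, `projection` is a tuple of exact fractions, so it can be compared directly against a hand computation.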


1.13. Prove Theorem 1.2: For any u, v, w in R^n and k in R:

(i) (u + v) · w = u · w + v · w, (ii) (ku) · v = k(u · v), (iii) u · v = v · u,
(iv) u · u ≥ 0, and u · u = 0 iff u = 0.

Let u = (u_1, u_2, ..., u_n), v = (v_1, v_2, ..., v_n), w = (w_1, w_2, ..., w_n).
(i) Because u + v = (u_1 + v_1, u_2 + v_2, ..., u_n + v_n),

\[
\begin{aligned}
(u + v) \cdot w &= (u_1 + v_1)w_1 + (u_2 + v_2)w_2 + \cdots + (u_n + v_n)w_n \\
&= u_1w_1 + v_1w_1 + u_2w_2 + v_2w_2 + \cdots + u_nw_n + v_nw_n \\
&= (u_1w_1 + u_2w_2 + \cdots + u_nw_n) + (v_1w_1 + v_2w_2 + \cdots + v_nw_n) \\
&= u \cdot w + v \cdot w
\end{aligned}
\]

(ii) Because ku = (ku_1, ku_2, ..., ku_n),

(ku) · v = ku_1v_1 + ku_2v_2 + ... + ku_nv_n = k(u_1v_1 + u_2v_2 + ... + u_nv_n) = k(u · v)

(iii) u · v = u_1v_1 + u_2v_2 + ... + u_nv_n = v_1u_1 + v_2u_2 + ... + v_nu_n = v · u

(iv) Because u_i^2 is nonnegative for each i, and because the sum of nonnegative real numbers is nonnegative,

u · u = u_1^2 + u_2^2 + ... + u_n^2 ≥ 0

Furthermore, u · u = 0 iff u_i = 0 for each i, that is, iff u = 0.
1.14. Prove Theorem 1.3 (Schwarz): |u · v| ≤ ||u|| ||v||.
For any real number t, and using Theorem 1.2, we have
0 ≤ (tu + v) · (tu + v) = t^2(u · u) + 2t(u · v) + (v · v) = ||u||^2 t^2 + 2(u · v)t + ||v||^2

Let a = ||u||^2, b = 2(u · v), c = ||v||^2. Then, for every value of t, at^2 + bt + c ≥ 0. This means that the
quadratic polynomial cannot have two distinct real roots. This implies that the discriminant D = b^2 - 4ac ≤ 0 or,
equivalently, b^2 ≤ 4ac. Thus,

4(u · v)^2 ≤ 4||u||^2 ||v||^2

Dividing by 4 and taking square roots gives us our result.


1.15. Prove Theorem 1.4 (Minkowski): ||u + v|| ≤ ||u|| + ||v||.
By the Schwarz inequality and other properties of the dot product,
||u + v||^2 = (u + v) · (u + v) = (u · u) + 2(u · v) + (v · v) ≤ ||u||^2 + 2||u|| ||v|| + ||v||^2 = (||u|| + ||v||)^2


Taking the square root of both sides yields the desired inequality. 


Points, Lines, Hyperplanes in R^n

Here we distinguish between an n-tuple P(a_1, a_2, ..., a_n) viewed as a point in R^n and an n-tuple
u = [c_1, c_2, ..., c_n] viewed as a vector (arrow) from the origin O to the point C(c_1, c_2, ..., c_n).

1.16. Find the vector u identified with the directed line segment PQ for the points:

(a) P(1, -2, 4) and Q(6, 1, -5) in R^3, (b) P(2, 3, -6, 5) and Q(7, 1, 4, -8) in R^4.
(a) u = PQ = Q - P = [6 - 1, 1 - (-2), -5 - 4] = [5, 3, -9]
(b) u = PQ = Q - P = [7 - 2, 1 - 3, 4 + 6, -8 - 5] = [5, -2, 10, -13]


1.17. Find an equation of the hyperplane H in R^4 that passes through P(3, -4, 1, -2) and is normal to
u = [2, 5, -6, -3].

The coefficients of the unknowns of an equation of H are the components of the normal vector u. Thus, an
equation of H is of the form 2x_1 + 5x_2 - 6x_3 - 3x_4 = k. Substitute P into this equation to obtain
k = 2(3) + 5(-4) - 6(1) - 3(-2) = -14. Thus, an equation of H is 2x_1 + 5x_2 - 6x_3 - 3x_4 = -14.


1.18. Find an equation of the plane H in R^3 that contains P(1, -3, -4) and is parallel to the plane H'
determined by the equation 3x - 6y + 5z = 2.


The planes H and H’ are parallel if and only if their normal directions are parallel or antiparallel (opposite 
direction). Hence, an equation of H is of the form 3x — 6y + 5z = k. Substitute P into this equation to obtain 
k = 1. Then an equation of H is 3x — 6y + 5z = 1. 


1.19. Find a parametric representation of the line L in R^4 passing through P(4, -2, 3, 1) in the direction
of u = [2, 5, -7, 8].

Here L consists of the points X(x_i) that satisfy
X = P + tu or x_i = a_i t + b_i or L(t) = (a_i t + b_i)

where the parameter t takes on all real values. Thus we obtain

x_1 = 4 + 2t, x_2 = -2 + 5t, x_3 = 3 - 7t, x_4 = 1 + 8t or L(t) = (4 + 2t, -2 + 5t, 3 - 7t, 1 + 8t)
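
The two recipes above (substitute a point to find the constant of a hyperplane; add t times the direction to a point to trace a line) are easy to sketch in code. The function names `hyperplane_constant` and `line_point` below are hypothetical and used only for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def hyperplane_constant(normal, point):
    """Constant k such that normal . x = k is the hyperplane through `point`."""
    return dot(normal, point)

def line_point(point, direction, t):
    """Point X = P + t*u on the line through P with direction u."""
    return tuple(p + t * d for p, d in zip(point, direction))

# Plane through P(1, -3, -4) parallel to 3x - 6y + 5z = 2 (same normal vector).
k = hyperplane_constant((3, -6, 5), (1, -3, -4))

# Line through P(4, -2, 3, 1) in the direction u = [2, 5, -7, 8]; t = 0 gives P.
start = line_point((4, -2, 3, 1), (2, 5, -7, 8), 0)
```

Evaluating `line_point` at several values of t gives sample points of the line; every difference of two such points is a multiple of the direction vector.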




1.20. Let C be the curve F(t) = (t^2, 3t - 2, t^3, t^2 + 5) in R^4, where 0 ≤ t ≤ 4.
(a) Find the point P on C corresponding to t = 2.
(b) Find the initial point Q and terminal point Q' of C.
(c) Find the unit tangent vector T to the curve C when t = 2.

(a) Substitute t = 2 into F(t) to get P = F(2) = (4, 4, 8, 9).

(b) The parameter t ranges from t = 0 to t = 4. Hence, Q = F(0) = (0, -2, 0, 5) and
Q' = F(4) = (16, 10, 64, 21).

(c) Take the derivative of F(t), that is, of each component of F(t), to obtain a vector V that is tangent to
the curve:

V(t) = dF(t)/dt = [2t, 3, 3t^2, 2t]

Now find V when t = 2; that is, substitute t = 2 in the equation for V(t) to obtain
V = V(2) = [4, 3, 12, 4]. Then normalize V to obtain the desired unit tangent vector T. We have

||V|| = √(16 + 9 + 144 + 16) = √185 and T = [4/√185, 3/√185, 12/√185, 4/√185]


Spatial Vectors (Vectors in R^3), ijk Notation, Cross Product
1.21. Let u = 2i - 3j + 4k, v = 3i + j - 2k, w = i + 5j + 3k. Find:
(a) u + v, (b) 2u - 3v + 4w, (c) u · v and u · w, (d) ||u|| and ||v||.
Treat the coefficients of i, j, k just like the components of a vector in R^3.
(a) Add corresponding coefficients to get u + v = 5i - 2j + 2k.
(b) First perform the scalar multiplication and then the vector addition:
2u - 3v + 4w = (4i - 6j + 8k) + (-9i - 3j + 6k) + (4i + 20j + 12k)
= -i + 11j + 26k

(c) Multiply corresponding coefficients and then add:
u · v = 6 - 3 - 8 = -5 and u · w = 2 - 15 + 12 = -1
(d) The norm is the square root of the sum of the squares of the coefficients:

||u|| = √(4 + 9 + 16) = √29 and ||v|| = √(9 + 1 + 4) = √14


1.22. Find the (parametric) equation of the line L: 
(a) through the points P(1,3,2) and Q(2,5, —6); 


(b) containing the point P(1,—2,4) and perpendicular to the plane H given by the equation 
3x + 5y + 7z = 15. 


(a) First find v = PQ = Q - P = [1, 2, -8] = i + 2j - 8k. Then
L(t) = (t + 1, 2t + 3, -8t + 2) = (t + 1)i + (2t + 3)j + (-8t + 2)k

(b) Because L is perpendicular to H, the line L is in the same direction as the normal vector
N = 3i + 5j + 7k to H. Thus,

L(t) = (3t + 1, 5t - 2, 7t + 4) = (3t + 1)i + (5t - 2)j + (7t + 4)k


1.23. Let S be the surface xy^2 + 2yz = 16 in R^3.

(a) Find the normal vector N(x, y, z) to the surface S.
(b) Find the tangent plane H to S at the point P(1, 2, 3).

(a) The formula for the normal vector to a surface F(x, y, z) = 0 is
N(x, y, z) = F_x i + F_y j + F_z k
where F_x, F_y, F_z are the partial derivatives. Using F(x, y, z) = xy^2 + 2yz - 16, we obtain
F_x = y^2, F_y = 2xy + 2z, F_z = 2y
Thus, N(x, y, z) = y^2 i + (2xy + 2z)j + 2yk.
(b) The normal to the surface S at the point P is
N(P) = N(1, 2, 3) = 4i + 10j + 4k

Hence, N = 2i + 5j + 2k is also normal to S at P. Thus an equation of H has the form 2x + 5y + 2z = c.
Substitute P in this equation to obtain c = 18. Thus the tangent plane H to S at P is 2x + 5y + 2z = 18.


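1.24. Evaluate the following determinants and negative of determinants of order two:

(a) (i) $\begin{vmatrix} 3 & 4 \\ 5 & 9 \end{vmatrix}$, (ii) $\begin{vmatrix} 2 & -1 \\ 4 & 3 \end{vmatrix}$, (iii) $\begin{vmatrix} 4 & -5 \\ 3 & -2 \end{vmatrix}$

(b) (i) $-\begin{vmatrix} 3 & 6 \\ 4 & 2 \end{vmatrix}$, (ii) $-\begin{vmatrix} 7 & -5 \\ 3 & 2 \end{vmatrix}$, (iii) $-\begin{vmatrix} 4 & -1 \\ 8 & -3 \end{vmatrix}$

Use $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$ and $-\begin{vmatrix} a & b \\ c & d \end{vmatrix} = bc - ad$. Thus,

(a) (i) 27 - 20 = 7, (ii) 6 + 4 = 10, (iii) -8 + 15 = 7.
(b) (i) 24 - 6 = 18, (ii) -15 - 14 = -29, (iii) -8 + 12 = 4.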


1.25. Let u = 2i - 3j + 4k, v = 3i + j - 2k, w = i + 5j + 3k.
Find: (a) u × v, (b) u × w

(a) Use $\begin{bmatrix} 2 & -3 & 4 \\ 3 & 1 & -2 \end{bmatrix}$ to get u × v = (6 - 4)i + (12 + 4)j + (2 + 9)k = 2i + 16j + 11k.

(b) Use $\begin{bmatrix} 2 & -3 & 4 \\ 1 & 5 & 3 \end{bmatrix}$ to get u × w = (-9 - 20)i + (4 - 6)j + (10 + 3)k = -29i - 2j + 13k.

1.26. Find u × v, where: (a) u = (1, 2, 3), v = (4, 5, 6); (b) u = (-4, 7, 3), v = (6, -5, 2).

(a) Use $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$ to get u × v = [12 - 15, 12 - 6, 5 - 8] = [-3, 6, -3].

(b) Use $\begin{bmatrix} -4 & 7 & 3 \\ 6 & -5 & 2 \end{bmatrix}$ to get u × v = [14 + 15, 18 + 8, 20 - 42] = [29, 26, -22].

1.27. Find a unit vector u orthogonal to v = [1, 3, 4] and w = [2, -6, -5].

First find v × w, which is orthogonal to v and w.

The array $\begin{bmatrix} 1 & 3 & 4 \\ 2 & -6 & -5 \end{bmatrix}$ gives v × w = [-15 + 24, 8 + 5, -6 - 6] = [9, 13, -12].

Normalize v × w to get u = [9/√394, 13/√394, -12/√394].
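
The component formula for the cross product can be checked mechanically. The sketch below is illustrative only; the helper names are not from the text. It computes the cross product from the formula (a_2b_3 - a_3b_2, a_3b_1 - a_1b_3, a_1b_2 - a_2b_1) and then normalizes the result to get a unit vector orthogonal to both inputs.

```python
import math

def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product of two vectors in R^3."""
    a1, a2, a3 = u
    b1, b2, b3 = v
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def unit(u):
    """Unit vector in the direction of nonzero u."""
    length = math.sqrt(dot(u, u))
    return tuple(a / length for a in u)

v, w = (1, 3, 4), (2, -6, -5)
n = cross(v, w)           # orthogonal to both v and w
u_hat = unit(n)           # floating-point unit vector
```

Because `cross` uses only integer arithmetic on integer inputs, orthogonality (`dot(n, v) == 0`) and Lagrange's identity can be verified exactly before normalizing.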


1.28. Let u = (a_1, a_2, a_3) and v = (b_1, b_2, b_3) so u × v = (a_2b_3 - a_3b_2, a_3b_1 - a_1b_3, a_1b_2 - a_2b_1).
Prove:

(a) u × v is orthogonal to u and v [Theorem 1.5(a)].
(b) ||u × v||^2 = (u · u)(v · v) - (u · v)^2 (Lagrange's identity).

(a) We have

\[
\begin{aligned}
u \cdot (u \times v) &= a_1(a_2b_3 - a_3b_2) + a_2(a_3b_1 - a_1b_3) + a_3(a_1b_2 - a_2b_1) \\
&= a_1a_2b_3 - a_1a_3b_2 + a_2a_3b_1 - a_1a_2b_3 + a_1a_3b_2 - a_2a_3b_1 = 0
\end{aligned}
\]

Thus, u × v is orthogonal to u. Similarly, u × v is orthogonal to v.
(b) We have

\[
\|u \times v\|^2 = (a_2b_3 - a_3b_2)^2 + (a_3b_1 - a_1b_3)^2 + (a_1b_2 - a_2b_1)^2 \tag{1}
\]
\[
(u \cdot u)(v \cdot v) - (u \cdot v)^2 = (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1b_1 + a_2b_2 + a_3b_3)^2 \tag{2}
\]

Expansion of the right-hand sides of (1) and (2) establishes the identity.


Complex Numbers, Vectors in C^n

1.29. Suppose z = 5 + 3i and w = 2 - 4i. Find: (a) z + w, (b) z - w, (c) zw.

Use the ordinary rules of algebra together with i^2 = -1 to obtain a result in the standard form a + bi.
(a) z + w = (5 + 3i) + (2 - 4i) = 7 - i

(b) z - w = (5 + 3i) - (2 - 4i) = 5 + 3i - 2 + 4i = 3 + 7i

(c) zw = (5 + 3i)(2 - 4i) = 10 - 14i - 12i^2 = 10 - 14i + 12 = 22 - 14i

1.30. Simplify: (a) (5 + 3i)(2 - 7i), (b) (4 - 3i)^2, (c) (1 + 2i)^3.
(a) (5 + 3i)(2 - 7i) = 10 + 6i - 35i - 21i^2 = 31 - 29i

(b) (4 - 3i)^2 = 16 - 24i + 9i^2 = 7 - 24i

(c) (1 + 2i)^3 = 1 + 6i + 12i^2 + 8i^3 = 1 + 6i - 12 - 8i = -11 - 2i


1.31. Simplify: (a) i^0, i^3, i^4, (b) i^5, i^6, i^7, i^8, (c) i^39, i^174, i^252, i^317.
(a) i^0 = 1, i^3 = i^2(i) = (-1)(i) = -i, i^4 = (i^2)(i^2) = (-1)(-1) = 1
(b) i^5 = (i^4)(i) = (1)(i) = i, i^6 = (i^4)(i^2) = (1)(i^2) = i^2 = -1, i^7 = i^3 = -i, i^8 = i^4 = 1
(c) Using i^4 = 1 and i^n = i^(4q+r) = (i^4)^q i^r = 1^q i^r = i^r, divide the exponent n by 4 to obtain the remainder r:

i^39 = i^(4(9)+3) = (i^4)^9 i^3 = 1^9 i^3 = i^3 = -i, i^174 = i^2 = -1, i^252 = i^0 = 1, i^317 = i^1 = i


1.32. Find the complex conjugate of each of the following:

(a) 6 + 4i, 7 - 5i, 4 + i, -3 - i, (b) 6, -3, 4i, -9i.
(a) $\overline{6 + 4i} = 6 - 4i$, $\overline{7 - 5i} = 7 + 5i$, $\overline{4 + i} = 4 - i$, $\overline{-3 - i} = -3 + i$
(b) $\overline{6} = 6$, $\overline{-3} = -3$, $\overline{4i} = -4i$, $\overline{-9i} = 9i$

(Note that the conjugate of a real number is the original number, but the conjugate of a pure imaginary
number is the negative of the original number.)

1.33. Find $z\bar{z}$ and |z| when z = 3 + 4i.
For z = a + bi, use $z\bar{z} = a^2 + b^2$ and $|z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}$.
$z\bar{z} = 9 + 16 = 25$, $|z| = \sqrt{25} = 5$

1.34. Simplify $\dfrac{2 - 7i}{5 + 3i}$.
To simplify a fraction z/w of complex numbers, multiply both numerator and denominator by $\bar{w}$, the
conjugate of the denominator:

\[
\frac{2 - 7i}{5 + 3i} = \frac{(2 - 7i)(5 - 3i)}{(5 + 3i)(5 - 3i)} = \frac{-11 - 41i}{34} = -\frac{11}{34} - \frac{41}{34}i
\]
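
Python's built-in `complex` type (with `j` as the imaginary unit) gives a quick way to check this kind of arithmetic. The sketch below is an illustration, not part of the text; `i_power` is a hypothetical helper that applies the remainder-mod-4 rule for powers of i, and division returns a floating-point result rather than exact fractions.

```python
def i_power(n):
    """Return i**n for an integer n, using the remainder of n on division by 4."""
    return (1, 1j, -1, -1j)[n % 4]

z, w = 5 + 3j, 2 - 4j
total = z + w                      # (5 + 3i) + (2 - 4i)
product = z * w                    # (5 + 3i)(2 - 4i)
quotient = (2 - 7j) / (5 + 3j)     # multiply through by the conjugate of the denominator
conjugate = (6 + 4j).conjugate()   # conjugate of 6 + 4i
modulus = abs(3 + 4j)              # |3 + 4i| = sqrt(9 + 16)
```

Comparing `quotient` with `complex(-11/34, -41/34)` up to a small tolerance confirms the hand computation of the fraction.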
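1.35. Prove: For any complex numbers z, w ∈ C, (i) $\overline{z + w} = \bar{z} + \bar{w}$, (ii) $\overline{zw} = \bar{z}\,\bar{w}$, (iii) $\bar{\bar{z}} = z$.

Suppose z = a + bi and w = c + di where a, b, c, d ∈ R.

(i) 
\[
\begin{aligned}
\overline{z + w} &= \overline{(a + bi) + (c + di)} = \overline{(a + c) + (b + d)i} \\
&= (a + c) - (b + d)i = a + c - bi - di \\
&= (a - bi) + (c - di) = \bar{z} + \bar{w}
\end{aligned}
\]

(ii)
\[
\begin{aligned}
\overline{zw} &= \overline{(a + bi)(c + di)} = \overline{(ac - bd) + (ad + bc)i} \\
&= (ac - bd) - (ad + bc)i = (a - bi)(c - di) = \bar{z}\,\bar{w}
\end{aligned}
\]

(iii) $\bar{\bar{z}} = \overline{\overline{a + bi}} = \overline{a - bi} = a - (-b)i = a + bi = z$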


1.36. Prove: For any complex numbers z, w ∈ C, |zw| = |z||w|.
By (ii) of Problem 1.35,
$|zw|^2 = (zw)(\overline{zw}) = (zw)(\bar{z}\,\bar{w}) = (z\bar{z})(w\bar{w}) = |z|^2 |w|^2$


The square root of both sides gives us the desired result. 


1.37. Prove: For any complex numbers z, w ∈ C, |z + w| ≤ |z| + |w|.

Suppose z = a + bi and w = c + di where a, b, c, d ∈ R. Consider the vectors u = (a, b) and v = (c, d) in
R^2. Note that

|z| = √(a^2 + b^2) = ||u||, |w| = √(c^2 + d^2) = ||v||

and

|z + w| = |(a + c) + (b + d)i| = √((a + c)^2 + (b + d)^2) = ||(a + c, b + d)|| = ||u + v||
By Minkowski's inequality (Problem 1.15), ||u + v|| ≤ ||u|| + ||v||, and so
|z + w| = ||u + v|| ≤ ||u|| + ||v|| = |z| + |w|


1.38. Find the dot products u · v and v · u where: (a) u = (1 - 2i, 3 + i), v = (4 + 2i, 5 - 6i),
(b) u = (3 - 2i, 4i, 1 + 6i), v = (5 + i, 2 - 3i, 7 + 2i).

Recall that conjugates of the second vector appear in the dot product

$(z_1, \ldots, z_n) \cdot (w_1, \ldots, w_n) = z_1\bar{w}_1 + \cdots + z_n\bar{w}_n$

(a) $u \cdot v = (1 - 2i)\overline{(4 + 2i)} + (3 + i)\overline{(5 - 6i)}$
= (1 - 2i)(4 - 2i) + (3 + i)(5 + 6i) = -10i + 9 + 23i = 9 + 13i
$v \cdot u = (4 + 2i)\overline{(1 - 2i)} + (5 - 6i)\overline{(3 + i)}$
= (4 + 2i)(1 + 2i) + (5 - 6i)(3 - i) = 10i + 9 - 23i = 9 - 13i
(b) $u \cdot v = (3 - 2i)\overline{(5 + i)} + (4i)\overline{(2 - 3i)} + (1 + 6i)\overline{(7 + 2i)}$
= (3 - 2i)(5 - i) + (4i)(2 + 3i) + (1 + 6i)(7 - 2i) = 20 + 35i
$v \cdot u = (5 + i)\overline{(3 - 2i)} + (2 - 3i)\overline{(4i)} + (7 + 2i)\overline{(1 + 6i)}$
= (5 + i)(3 + 2i) + (2 - 3i)(-4i) + (7 + 2i)(1 - 6i) = 20 - 35i

In both cases, $v \cdot u = \overline{u \cdot v}$. This holds true in general, as seen in Problem 1.40.
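
The convention of conjugating the entries of the second vector can be sketched with Python's built-in complex numbers. The name `cdot` is an illustrative choice, not notation from the text.

```python
def cdot(u, v):
    """Dot product in C^n: sum of u_k times the conjugate of v_k."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

u = (1 - 2j, 3 + 1j)
v = (4 + 2j, 5 - 6j)
uv = cdot(u, v)
vu = cdot(v, u)

u3 = (3 - 2j, 4j, 1 + 6j)
v3 = (5 + 1j, 2 - 3j, 7 + 2j)
uv3 = cdot(u3, v3)
```

Swapping the arguments conjugates the result, so `vu == uv.conjugate()` holds for these inputs; for small integer components the floating-point arithmetic is exact.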


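1.39. Let u = (7 - 2i, 2 + 5i) and v = (1 + i, -3 - 6i). Find:
(a) u + v, (b) 2iu, (c) (3 - i)v, (d) u · v, (e) ||u|| and ||v||.
(a) u + v = (7 - 2i + 1 + i, 2 + 5i - 3 - 6i) = (8 - i, -1 - i)
(b) 2iu = (14i - 4i^2, 4i + 10i^2) = (4 + 14i, -10 + 4i)
(c) (3 - i)v = (3 + 3i - i - i^2, -9 - 18i + 3i + 6i^2) = (4 + 2i, -15 - 15i)
(d) $u \cdot v = (7 - 2i)\overline{(1 + i)} + (2 + 5i)\overline{(-3 - 6i)}$
= (7 - 2i)(1 - i) + (2 + 5i)(-3 + 6i) = 5 - 9i - 36 - 3i = -31 - 12i
(e) ||u|| = √(7^2 + (-2)^2 + 2^2 + 5^2) = √82 and ||v|| = √(1^2 + 1^2 + (-3)^2 + (-6)^2) = √47

1.40. Prove: For any vectors u, v ∈ C^n and any scalar z ∈ C, (i) $u \cdot v = \overline{v \cdot u}$, (ii) (zu) · v = z(u · v),
(iii) $u \cdot (zv) = \bar{z}(u \cdot v)$.
Suppose u = (z_1, z_2, ..., z_n) and v = (w_1, w_2, ..., w_n).

(i) Using the properties of the conjugate,

\[
\begin{aligned}
\overline{v \cdot u} &= \overline{w_1\bar{z}_1 + w_2\bar{z}_2 + \cdots + w_n\bar{z}_n} = \overline{w_1\bar{z}_1} + \overline{w_2\bar{z}_2} + \cdots + \overline{w_n\bar{z}_n} \\
&= \bar{w}_1z_1 + \bar{w}_2z_2 + \cdots + \bar{w}_nz_n = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n = u \cdot v
\end{aligned}
\]

(ii) Because zu = (zz_1, zz_2, ..., zz_n),

$(zu) \cdot v = zz_1\bar{w}_1 + zz_2\bar{w}_2 + \cdots + zz_n\bar{w}_n = z(z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n) = z(u \cdot v)$

(Compare with Theorem 1.2 on vectors in R^n.)
(iii) Using (i) and (ii),

$u \cdot (zv) = \overline{(zv) \cdot u} = \overline{z(v \cdot u)} = \bar{z}\,\overline{(v \cdot u)} = \bar{z}(u \cdot v)$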

SUPPLEMENTARY PROBLEMS 


Vectors in R^n

1.41. Let u = (1, -2, 4), v = (3, 5, 1), w = (2, 1, -3). Find:

(a) 3u - 2v; (b) 5u + 3v - 4w; (c) u · v, u · w, v · w; (d) ||u||, ||v||;
(e) cos θ, where θ is the angle between u and v; (f) d(u, v); (g) proj(u, v).

1.42. Repeat Problem 1.41 for vectors $u = \begin{bmatrix} 1 \\ 3 \\ -4 \end{bmatrix}$, $v = \begin{bmatrix} 2 \\ 1 \\ 5 \end{bmatrix}$, $w = \begin{bmatrix} 3 \\ -2 \\ 6 \end{bmatrix}$.

1.43. Let u = (2, -5, 4, 6, -3) and v = (5, -2, 1, -7, -4). Find:
(a) 4u - 3v; (b) 5u + 2v; (c) u · v; (d) ||u|| and ||v||; (e) proj(u, v); (f) d(u, v).

1.44. Normalize each vector:
(a) u = (5, -7); (b) v = (1, 2, -2, 4); (c) w = (1/2, -1/3, 3/4).

1.45. Let u = (1, 2, -2), v = (3, -12, 4), and k = -3.
(a) Find ||u||, ||v||, ||u + v||, ||ku||.
(b) Verify that ||ku|| = |k| ||u|| and ||u + v|| ≤ ||u|| + ||v||.

1.46. Find x and y where:

(a) (x, y + 1) = (y - 2, 6); (b) x(2, y) = y(1, -2).

1.47. Find x, y, z where (x, y + 1, y + z) = (2x + y, 4, 3z).
1.48. Write v = (2, 5) as a linear combination of u_1 and u_2, where:
(a) u_1 = (1, 2) and u_2 = (3, 5);
(b) u_1 = (3, -4) and u_2 = (2, -3).

1.49. Write $v = \begin{bmatrix} 9 \\ -3 \\ 16 \end{bmatrix}$ as a linear combination of $u_1 = \begin{bmatrix} 1 \\ 3 \\ 3 \end{bmatrix}$, $u_2 = \begin{bmatrix} 2 \\ 5 \\ -1 \end{bmatrix}$, $u_3 = \begin{bmatrix} 4 \\ -2 \\ 3 \end{bmatrix}$.

1.50. Find k so that u and v are orthogonal, where:
(a) u = (3, k, -2), v = (6, -4, -3);

(b) u = (5, k, -4, 2), v = (1, -3, 2, 2k);

(c) u = (1, 7, k + 2, -2), v = (3, k, -3, k).


Located Vectors, Hyperplanes, Lines in R^n

1.51. Find the vector v identified with the directed line segment PQ for the points:
(a) P(2, 3, -7) and Q(1, -6, -5) in R^3;
(b) P(1, -8, -4, 6) and Q(3, -5, 2, -4) in R^4.

1.52. Find an equation of the hyperplane H in R^4 that:

(a) contains P(1, 2, -3, 2) and is normal to u = [2, 3, -5, 6];

(b) contains P(3, -1, 2, 5) and is parallel to 2x_1 - 3x_2 + 5x_3 - 7x_4 = 4.

1.53. Find a parametric representation of the line in R^4 that:

(a) passes through the points P(1, 2, 1, 2) and Q(3, -5, 7, -9);
(b) passes through P(1, 1, 3, 3) and is perpendicular to the hyperplane 2x_1 + 4x_2 + 6x_3 - 8x_4 = 5.


Spatial Vectors (Vectors in R^3), ijk Notation

1.54. Given u = 3i - 4j + 2k, v = 2i + 5j - 3k, w = 4i + 7j + 2k. Find:
(a) 2u - 3v; (b) 3u + 4v - 2w; (c) u · v, u · w, v · w; (d) ||u||, ||v||, ||w||.

1.55. Find the equation of the plane H:

(a) with normal N = 3i - 4j + 5k and containing the point P(1, 2, -3);

(b) parallel to 4x + 3y - 2z = 11 and containing the point Q(2, -1, 3).

1.56. Find the (parametric) equation of the line L:

(a) through the point P(2, 5, -3) and in the direction of v = 4i - 5j + 7k;
(b) perpendicular to the plane 2x - 3y + 7z = 4 and containing P(1, -5, 7).

1.57. Consider the following curve C in R^3 where 0 ≤ t ≤ 5:
F(t) = t^3 i - t^2 j + (2t - 3)k
(a) Find the point P on C corresponding to t = 2.
(b) Find the initial point Q and the terminal point Q'.
(c) Find the unit tangent vector T to the curve C when t = 2.

1.58. Consider a moving body B whose position at time t is given by R(t) = t^2 i + t^3 j + 3t k. [Then
V(t) = dR(t)/dt and A(t) = dV(t)/dt denote, respectively, the velocity and acceleration of B.] When
t = 1, find for the body B:

(a) position; (b) velocity v; (c) speed s; (d) acceleration a.
1.59. Find a normal vector N and the tangent plane H to each surface at the given point:

(a) surface x^2 y + 3yz = 20 and point P(1, 3, 2);
(b) surface x^2 + 3y^2 - 5z^2 = 160 and point P(3, -2, 1).


Cross Product 


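1.60. Evaluate the following determinants and negative of determinants of order two:

(a) $\begin{vmatrix} 2 & 5 \\ 3 & 6 \end{vmatrix}$, $\begin{vmatrix} 3 & -6 \\ 1 & -4 \end{vmatrix}$, $\begin{vmatrix} -4 & -2 \\ 7 & -3 \end{vmatrix}$

(b) $-\begin{vmatrix} 6 & 4 \\ 7 & 5 \end{vmatrix}$, $-\begin{vmatrix} 1 & -3 \\ 2 & 4 \end{vmatrix}$, $-\begin{vmatrix} 8 & -3 \\ -6 & -2 \end{vmatrix}$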


1.61. Given u = 3i - 4j + 2k, v = 2i + 5j - 3k, w = 4i + 7j + 2k, find:

(a) u × v, (b) u × w, (c) v × w.

1.62. Given u = [2, 1, 3], v = [4, -2, 2], w = [1, 1, 5], find:

(a) u × v, (b) u × w, (c) v × w.

1.63. Find the volume V of the parallelepiped formed by the vectors u, v, w appearing in:
(a) Problem 1.60 (b) Problem 1.61.

1.64. Find a unit vector u orthogonal to:
(a) v = [1, 2, 3] and w = [1, -1, 2];
(b) v = 3i - j + 2k and w = 4i - 2j - k.

1.65. Prove the following properties of the cross product:

(a) u × v = -(v × u)
(b) u × u = 0 for any vector u
(c) (ku) × v = k(u × v) = u × (kv)
(d) u × (v + w) = (u × v) + (u × w)
(e) (v + w) × u = (v × u) + (w × u)
(f) (u × v) × w = (u · w)v - (v · w)u


Complex Numbers 
1.66. Simplify:

(a) (4 - 7i)(9 + 2i); (b) (3 - 5i)^2; (c) 1/(4 - 7i); (d) (9 + 2i)/(3 - 5i); (e) (1 - i)^3.

1.67. Simplify: (a) 1/(2i); (b) (2 + 3i)/(7 - 3i); (c) i^15, i^25, i^34; (d) (1/(3 - i))^2.


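1.68. Let z = 2 - 5i and w = 7 + 3i. Find:

(a) z + w; (b) zw; (c) z/w; (d) $\bar{z}$, $\bar{w}$; (e) |z|, |w|.

1.69. Show that for complex numbers z and w:

(a) $\operatorname{Re} z = \tfrac{1}{2}(z + \bar{z})$, (b) $\operatorname{Im} z = \tfrac{1}{2i}(z - \bar{z})$, (c) zw = 0 implies z = 0 or w = 0.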


Vectors in C^n
1.70. Let u = (1 + 7i, 2 - 6i) and v = (5 - 2i, 3 - 4i). Find:
(a) u + v (b) (3 + i)u (c) 2iu + (4 + 7i)v (d) u · v (e) ||u|| and ||v||.

1.71. Prove: For any vectors u, v, w in C^n:

(a) (u + v) · w = u · w + v · w, (b) w · (u + v) = w · u + w · v.

1.72. Prove that the norm in C^n satisfies the following laws:

[N_1] For any vector u, ||u|| ≥ 0; and ||u|| = 0 if and only if u = 0.
[N_2] For any vector u and complex number z, ||zu|| = |z| ||u||.
[N_3] For any vectors u and v, ||u + v|| ≤ ||u|| + ||v||.


ANSWERS TO SUPPLEMENTARY PROBLEMS 


1.41. (a) (-3, -16, 4); (b) (6, 1, 35); (c) -3, 12, 8; (d) √21, √35, √14;
(e) -3/(√21 √35); (f) √62; (g) -(3/35)(3, 5, 1) = (-9/35, -15/35, -3/35)

1.42. (Column vectors) (a) (-1, 7, -22); (b) (-1, 26, -29); (c) -15, -27, 34;
(d) √26, √30; (e) -15/(√26 √30); (f) √86; (g) -(15/30)v = (-1, -1/2, -5/2)

1.43. (a) (-13, -14, 13, 45, 0); (b) (20, -29, 22, 16, -23); (c) -6; (d) √90, √95;
(e) -(6/95)v; (f) √167

1.44. (a) (5/√76, 9/√76); (b) (1/5, 2/5, -2/5, 4/5); (c) (6/√133, -4/√133, 9/√133)

1.45. (a) 3, 13, √120, 9

1.46. (a) x = -3, y = 5; (b) x = 0, y = 0, and x = 1, y = 2

1.47. x = -3, y = 3, z = 3/2

1.48. (a) v = 5u_1 - u_2; (b) v = 16u_1 - 23u_2

1.49. v = 3u_1 - u_2 + 2u_3

1.50. (a) 6; (b) 3; (c) 3/2

1.51. (a) v = [-1, -9, 2]; (b) [2, 3, 6, -10]

1.52. (a) 2x_1 + 3x_2 - 5x_3 + 6x_4 = 35; (b) 2x_1 - 3x_2 + 5x_3 - 7x_4 = -16

1.53. (a) [2t + 1, -7t + 2, 6t + 1, -11t + 2]; (b) [2t + 1, 4t + 1, 6t + 3, -8t + 3]

1.54. (a) -23j + 13k; (b) 9i - 6j - 10k; (c) -20, -12, 37; (d) √29, √38, √69

1.55. (a) 3x - 4y + 5z = -20; (b) 4x + 3y - 2z = -1

1.56. (a) [4t + 2, -5t + 5, 7t - 3]; (b) [2t + 1, -3t - 5, 7t + 7]

1.57. (a) P = F(2) = 8i - 4j + k; (b) Q = F(0) = -3k, Q' = F(5) = 125i - 25j + 7k;
(c) T = (6i - 2j + k)/√41

1.58. (a) i + j + 2k; (b) 2i + 3j + 2k; (c) √17; (d) 2i + 6j

1.59. (a) N = 6i + 7j + 9k, 6x + 7y + 9z = 45; (b) N = 6i - 12j - 10k, 3x - 6y - 5z = 16


1.60. (a) -3, -6, 26; (b) -2, -10, 34

1.61. (a) 2i + 13j + 23k; (b) -22i + 2j + 37k; (c) 31i - 16j - 6k

1.62. (a) [5, 8, -6]; (b) [2, -7, 1]; (c) [-7, -18, 5]

1.63. (a) 143; (b) 17

1.64. (a) (7, 1, -3)/√59; (b) (5i + 11j - 2k)/√150

1.66. (a) 50 - 55i; (b) -16 - 30i; (c) (1/65)(4 + 7i); (d) (1/2)(1 + 3i); (e) -2 - 2i

1.67. (a) -(1/2)i; (b) (1/58)(5 + 27i); (c) -i, i, -1; (d) (1/50)(4 + 3i)

1.68. (a) 9 - 2i; (b) 29 - 29i; (c) (1/58)(-1 - 41i); (d) 2 + 5i, 7 - 3i; (e) √29, √58

1.69. (c) Hint: If zw = 0, then |zw| = |z||w| = |0| = 0

1.70. (a) (6 + 5i, 5 - 10i); (b) (-4 + 22i, 12 - 16i); (c) (-8 - 41i, -4 - 33i);
(d) 12 + 2i; (e) √90, √54
Algebra of Matrices


2.1 Introduction 


This chapter investigates matrices and algebraic operations defined on them. These matrices may be 
viewed as rectangular arrays of elements where each entry depends on two subscripts (as compared with 
vectors, where each entry depended on only one subscript). Systems of linear equations and their 
solutions (Chapter 3) may be efficiently investigated using the language of matrices. Furthermore, certain 
abstract objects introduced in later chapters, such as "change of basis," "linear transformations," and
"quadratic forms," can be represented by these matrices (rectangular arrays). On the other hand, the
abstract treatment of linear algebra presented later on will give us new insight into the structure of these 
matrices. 

The entries in our matrices will come from some arbitrary, but fixed, field K. The elements of K are 
called numbers or scalars. Nothing essential is lost if the reader assumes that K is the real field R. 


2.2 Matrices 


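A matrix A over a field K or, simply, a matrix A (when K is implicit) is a rectangular array of scalars
usually presented in the following form:

\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\cdots & \cdots & \cdots & \cdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\]

The rows of such a matrix A are the m horizontal lists of scalars:

(a_11, a_12, ..., a_1n), (a_21, a_22, ..., a_2n), ..., (a_m1, a_m2, ..., a_mn)

and the columns of A are the n vertical lists of scalars:

\[
\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}, \quad
\begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}, \quad \ldots, \quad
\begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
\]

Note that the element a_ij, called the ij-entry or ij-element, appears in row i and column j. We frequently
denote such a matrix by simply writing A = [a_ij].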

A matrix with m rows and n columns is called an m by n matrix, written m x n. The pair of numbers m 
and n is called the size of the matrix. Two matrices A and B are equal, written A = B, if they have the 
same size and if corresponding elements are equal. Thus, the equality of two m x n matrices is equivalent 
to a system of mn equalities, one for each corresponding pair of elements. 

A matrix with only one row is called a row matrix or row vector, and a matrix with only one column is 
called a column matrix or column vector. A matrix whose entries are all zero is called a zero matrix and 


will usually be denoted by 0. 
Matrices whose entries are all real numbers are called real matrices and are said to be matrices over R.
Analogously, matrices whose entries are all complex numbers are called complex matrices and are said to
be matrices over C. This text will be mainly concerned with such real and complex matrices.

EXAMPLE 2.1

(a) The rectangular array $A = \begin{bmatrix} 1 & -4 & 5 \\ 0 & 3 & -2 \end{bmatrix}$ is a 2 × 3 matrix. Its rows are (1, -4, 5) and (0, 3, -2),
and its columns are

\[
\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} -4 \\ 3 \end{bmatrix}, \quad \begin{bmatrix} 5 \\ -2 \end{bmatrix}
\]

(b) The 2 × 4 zero matrix is the matrix $0 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$.
(c) Find x, y, z, t such that

\[
\begin{bmatrix} x + y & 2z + t \\ x - y & z - t \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 1 & 5 \end{bmatrix}
\]

By definition of equality of matrices, the four corresponding entries must be equal. Thus,

x + y = 3, x - y = 1, 2z + t = 7, z - t = 5

Solving the above system of equations yields x = 2, y = 1, z = 4, t = -1.


2.3 Matrix Addition and Scalar Multiplication 


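Let A = [a_ij] and B = [b_ij] be two matrices with the same size, say m × n matrices. The sum of A and B,
written A + B, is the matrix obtained by adding corresponding elements from A and B. That is,

\[
A + B = \begin{bmatrix}
a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\
a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\
\cdots & \cdots & \cdots & \cdots \\
a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn}
\end{bmatrix}
\]

The product of the matrix A by a scalar k, written k · A or simply kA, is the matrix obtained by
multiplying each element of A by k. That is,

\[
kA = \begin{bmatrix}
ka_{11} & ka_{12} & \cdots & ka_{1n} \\
ka_{21} & ka_{22} & \cdots & ka_{2n} \\
\cdots & \cdots & \cdots & \cdots \\
ka_{m1} & ka_{m2} & \cdots & ka_{mn}
\end{bmatrix}
\]

Observe that A + B and kA are also m × n matrices. We also define
-A = (-1)A and A - B = A + (-B)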


The matrix —A is called the negative of the matrix A, and the matrix A — B is called the difference of A 
and B. The sum of matrices with different sizes is not defined. 
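EXAMPLE 2.2 Let $A = \begin{bmatrix} 1 & -2 & 3 \\ 0 & 4 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 4 & 6 & 8 \\ 1 & -3 & -7 \end{bmatrix}$. Then

\[
A + B = \begin{bmatrix} 1 + 4 & -2 + 6 & 3 + 8 \\ 0 + 1 & 4 + (-3) & 5 + (-7) \end{bmatrix} = \begin{bmatrix} 5 & 4 & 11 \\ 1 & 1 & -2 \end{bmatrix}
\]

\[
3A = \begin{bmatrix} 3(1) & 3(-2) & 3(3) \\ 3(0) & 3(4) & 3(5) \end{bmatrix} = \begin{bmatrix} 3 & -6 & 9 \\ 0 & 12 & 15 \end{bmatrix}
\]

\[
2A - 3B = \begin{bmatrix} 2 & -4 & 6 \\ 0 & 8 & 10 \end{bmatrix} + \begin{bmatrix} -12 & -18 & -24 \\ -3 & 9 & 21 \end{bmatrix} = \begin{bmatrix} -10 & -22 & -18 \\ -3 & 17 & 31 \end{bmatrix}
\]

The matrix 2A - 3B is called a linear combination of A and B.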

Basic properties of matrices under the operations of matrix addition and scalar multiplication follow. 


THEOREM 2.1: Consider any matrices A, B, C (with the same size) and any scalars k and k'. Then

(i) (A + B) + C = A + (B + C), (v) k(A + B) = kA + kB,
(ii) A + 0 = 0 + A = A, (vi) (k + k')A = kA + k'A,
(iii) A + (-A) = (-A) + A = 0, (vii) (kk')A = k(k'A),
(iv) A + B = B + A, (viii) 1 · A = A.

Note first that the 0 in (ii) and (iii) refers to the zero matrix. Also, by (i) and (iv), any sum of matrices
A_1 + A_2 + ... + A_n

requires no parentheses, and the sum does not depend on the order of the matrices. Furthermore, using
(vi) and (viii), we also have

A + A = 2A, A + A + A = 3A,


and so on. 

The proof of Theorem 2.1 reduces to showing that the ij-entries on both sides of each matrix equation 
are equal. (See Problem 2.3.) 

Observe the similarity between Theorem 2.1 for matrices and Theorem 1.1 for vectors. In fact, the 
above operations for matrices may be viewed as generalizations of the corresponding operations for 
vectors. 


2.4 Summation Symbol 


Before we define matrix multiplication, it will be instructive to first introduce the summation symbol Σ
(the Greek capital letter sigma).
Suppose f(k) is an algebraic expression involving the letter k. Then the expression

$\sum_{k=1}^{n} f(k)$ or equivalently $\sum\nolimits_{k=1}^{n} f(k)$

has the following meaning. First we set k = 1 in f(k), obtaining
f(1)

Then we set k = 2 in f(k), obtaining f(2), and add this to f(1), obtaining
f(1) + f(2)

Then we set k = 3 in f(k), obtaining f(3), and add this to the previous sum, obtaining

f(1) + f(2) + f(3)
We continue this process until we obtain the sum

f(1) + f(2) + ... + f(n)
Observe that at each step we increase the value of k by 1 until we reach n. The letter k is called the index,
and 1 and n are called, respectively, the lower and upper limits. Other letters frequently used as indices
are i and j.

We also generalize our definition by allowing the sum to range from any integer n_1 to any integer n_2.
That is, we define

\[
\sum_{k=n_1}^{n_2} f(k) = f(n_1) + f(n_1 + 1) + f(n_1 + 2) + \cdots + f(n_2)
\]

EXAMPLE 2.3

(a) $\sum_{k=1}^{5} x_k = x_1 + x_2 + x_3 + x_4 + x_5$ and $\sum_{i=1}^{n} a_ib_i = a_1b_1 + a_2b_2 + \cdots + a_nb_n$

(b) $\sum_{j=2}^{5} j^2 = 2^2 + 3^2 + 4^2 + 5^2 = 54$ and $\sum_{i=0}^{n} a_ix^i = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$

(c) $\sum_{k=1}^{p} a_{ik}b_{kj} = a_{i1}b_{1j} + a_{i2}b_{2j} + a_{i3}b_{3j} + \cdots + a_{ip}b_{pj}$
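
The step-by-step meaning of the summation symbol corresponds to a simple loop. In this sketch the function name `summation` and its argument order are assumptions made for illustration; Python's built-in `sum` over a `range` would do the same job.

```python
def summation(f, lower, upper):
    """Return f(lower) + f(lower + 1) + ... + f(upper), adding one term per step."""
    total = 0
    for k in range(lower, upper + 1):   # the index k increases by 1 until it reaches upper
        total += f(k)
    return total

squares = summation(lambda j: j ** 2, 2, 5)     # 2^2 + 3^2 + 4^2 + 5^2

# Inner sum that defines one entry of a matrix product: sum over k of a_ik * b_kj.
a_row = [7, -4, 5]
b_col = [3, 2, -1]
entry = summation(lambda k: a_row[k] * b_col[k], 0, len(a_row) - 1)
```

Note that Python sequences are indexed from 0, so the last sum runs from 0 to p - 1 rather than from 1 to p.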


2.5 Matrix Multiplication 


The product of matrices A and B, written AB, is somewhat complicated. For this reason, we first begin 
with a special case. 

The product AB of a row matrix A = [a_i] and a column matrix B = [b_i] with the same number of
elements is defined to be the scalar (or 1 × 1 matrix) obtained by multiplying corresponding entries and
adding; that is,

\[
AB = [a_1, a_2, \ldots, a_n] \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = a_1b_1 + a_2b_2 + \cdots + a_nb_n = \sum_{k=1}^{n} a_kb_k
\]

We emphasize that AB is a scalar (or a 1 × 1 matrix). The product AB is not defined when A and B have
different numbers of elements.

EXAMPLE 2.4

(a) $[7, -4, 5] \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix} = 7(3) + (-4)(2) + 5(-1) = 21 - 8 - 5 = 8$

(b) $[6, -1, 8, 3] \begin{bmatrix} 4 \\ -9 \\ -2 \\ 5 \end{bmatrix} = 24 + 9 - 16 + 15 = 32$


We are now ready to define matrix multiplication in general. 
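DEFINITION: Suppose A = [a_ik] and B = [b_kj] are matrices such that the number of columns of A is
equal to the number of rows of B; say, A is an m × p matrix and B is a p × n matrix.
Then the product AB is the m × n matrix whose ij-entry is obtained by multiplying the
ith row of A by the jth column of B. That is,

\[
\begin{bmatrix}
a_{11} & \cdots & a_{1p} \\
\vdots & & \vdots \\
a_{i1} & \cdots & a_{ip} \\
\vdots & & \vdots \\
a_{m1} & \cdots & a_{mp}
\end{bmatrix}
\begin{bmatrix}
b_{11} & \cdots & b_{1j} & \cdots & b_{1n} \\
\vdots & & \vdots & & \vdots \\
b_{p1} & \cdots & b_{pj} & \cdots & b_{pn}
\end{bmatrix}
=
\begin{bmatrix}
c_{11} & \cdots & c_{1n} \\
\vdots & c_{ij} & \vdots \\
c_{m1} & \cdots & c_{mn}
\end{bmatrix}
\]

where $c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ip}b_{pj} = \sum_{k=1}^{p} a_{ik}b_{kj}$

The product AB is not defined if A is an m × p matrix and B is a q × n matrix, where p ≠ q.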


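EXAMPLE 2.5

(a) Find AB where $A = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 & -4 \\ 5 & -2 & 6 \end{bmatrix}$.

Because A is 2 × 2 and B is 2 × 3, the product AB is defined and AB is a 2 × 3 matrix. To obtain
the first row of the product matrix AB, multiply the first row [1, 3] of A by each column of B,

\[
\begin{bmatrix} 2 \\ 5 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ -2 \end{bmatrix}, \quad \begin{bmatrix} -4 \\ 6 \end{bmatrix}
\]

respectively. That is, the first row of AB is

[2 + 15, 0 - 6, -4 + 18] = [17, -6, 14]

To obtain the second row of AB, multiply the second row [2, -1] of A by each column of B. Thus,

\[
AB = \begin{bmatrix} 17 & -6 & 14 \\ 4 - 5 & 0 + 2 & -8 - 6 \end{bmatrix} = \begin{bmatrix} 17 & -6 & 14 \\ -1 & 2 & -14 \end{bmatrix}
\]

(b) Suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 5 & 6 \\ 0 & -2 \end{bmatrix}$. Then

\[
AB = \begin{bmatrix} 5 + 0 & 6 - 4 \\ 15 + 0 & 18 - 8 \end{bmatrix} = \begin{bmatrix} 5 & 2 \\ 15 & 10 \end{bmatrix}
\quad\text{and}\quad
BA = \begin{bmatrix} 5 + 18 & 10 + 24 \\ 0 - 6 & 0 - 8 \end{bmatrix} = \begin{bmatrix} 23 & 34 \\ -6 & -8 \end{bmatrix}
\]

The above example shows that matrix multiplication is not commutative; that is, in general,
AB ≠ BA. However, matrix multiplication does satisfy the following properties.
THEOREM 2.2: Let A, B, C be matrices. Then, whenever the products and sums are defined,
(i) (AB)C = A(BC) (associative law),
(ii) A(B + C) = AB + AC (left distributive law),
(iii) (B + C)A = BA + CA (right distributive law),
(iv) k(AB) = (kA)B = A(kB), where k is a scalar.

We note that 0A = 0 and B0 = 0, where 0 is the zero matrix.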
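
The row-by-column rule can be sketched in a few lines. This is an illustrative implementation with matrices stored as lists of rows; the name `mat_mul` is an assumption, and the sample matrices are A = [1 2; 3 4] and B = [5 6; 0 -2], which do not commute.

```python
def mat_mul(A, B):
    """Product of an m x p matrix A and a p x n matrix B (lists of rows).

    The ij-entry is the sum over k of A[i][k] * B[k][j].
    """
    p = len(B)
    if any(len(row) != p for row in A):
        raise ValueError("columns of A must equal rows of B")
    n = len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(p)) for j in range(n)]
            for row in A]

A = [[1, 2], [3, 4]]
B = [[5, 6], [0, -2]]
AB = mat_mul(A, B)
BA = mat_mul(B, A)
commutes = (AB == BA)   # False for this pair
```

Checking `commutes` for a few pairs of square matrices is a quick way to see that AB and BA usually differ, while the associative and distributive laws of Theorem 2.2 continue to hold.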




2.6 Transpose of a Matrix 


The transpose of a matrix A, written A^T, is the matrix obtained by writing the columns of A, in order, as
rows. For example,

\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}
\quad\text{and}\quad
[1, -3, -5]^T = \begin{bmatrix} 1 \\ -3 \\ -5 \end{bmatrix}
\]

In other words, if A = [a_ij] is an m × n matrix, then A^T = [b_ij] is the n × m matrix where b_ij = a_ji.

Observe that the transpose of a row vector is a column vector. Similarly, the transpose of a column
vector is a row vector.

The next theorem lists basic properties of the transpose operation.

THEOREM 2.3: Let A and B be matrices and let k be a scalar. Then, whenever the sum and product are
defined,
(i) (A + B)^T = A^T + B^T, (iii) (kA)^T = kA^T,
(ii) (A^T)^T = A, (iv) (AB)^T = B^T A^T.


We emphasize that, by (iv), the transpose of a product is the product of the transposes, but in the 
reverse order. 
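
The reversal of order in property (iv) is easy to test numerically. The sketch below is illustrative: `transpose` and `mat_mul` are hypothetical helper names, matrices are lists of rows, and the two 3 × 3 test matrices are arbitrary sample values.

```python
def transpose(A):
    """Write the columns of A, in order, as rows."""
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    """Row-by-column product of A (m x p) and B (p x n)."""
    return [[sum(row[k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for row in A]

A = [[1, 2, 3], [-4, -4, -4], [5, 6, 7]]
B = [[2, -5, 1], [0, 3, -2], [1, 2, -4]]

lhs = transpose(mat_mul(A, B))                 # (AB)^T
rhs = mat_mul(transpose(B), transpose(A))      # B^T A^T
wrong_order = mat_mul(transpose(A), transpose(B))   # A^T B^T, generally different
```

For these matrices `lhs == rhs` holds, while `wrong_order` differs, which illustrates why the order must be reversed.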


2.7 Square Matrices 


A square matrix is a matrix with the same number of rows as columns. An n x n square matrix is said to 
be of order n and is sometimes called an n-square matrix. 

Recall that not every two matrices can be added or multiplied. However, if we only consider square 
matrices of some given order n, then this inconvenience disappears. Specifically, the operations of 
addition, multiplication, scalar multiplication, and transpose can be performed on any n x n matrices, and 
the result is again an n x n matrix. 


EXAMPLE 2.6 The following are square matrices of order 3: 


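\[
A = \begin{bmatrix} 1 & 2 & 3 \\ -4 & -4 & -4 \\ 5 & 6 & 7 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 2 & -5 & 1 \\ 0 & 3 & -2 \\ 1 & 2 & -4 \end{bmatrix}
\]

The following are also matrices of order 3:

\[
A + B = \begin{bmatrix} 3 & -3 & 4 \\ -4 & -1 & -6 \\ 6 & 8 & 3 \end{bmatrix}, \quad
2A = \begin{bmatrix} 2 & 4 & 6 \\ -8 & -8 & -8 \\ 10 & 12 & 14 \end{bmatrix}, \quad
A^T = \begin{bmatrix} 1 & -4 & 5 \\ 2 & -4 & 6 \\ 3 & -4 & 7 \end{bmatrix}
\]

\[
AB = \begin{bmatrix} 5 & 7 & -15 \\ -12 & 0 & 20 \\ 17 & 7 & -35 \end{bmatrix}, \quad
BA = \begin{bmatrix} 27 & 30 & 33 \\ -22 & -24 & -26 \\ -27 & -30 & -33 \end{bmatrix}
\]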


Diagonal and Trace 


Let $A = [a_{ij}]$ be an n-square matrix. The diagonal or main diagonal of $A$ consists of the elements with the same subscripts, that is,

\[
a_{11},\ a_{22},\ a_{33},\ \ldots,\ a_{nn}
\]

The trace of $A$, written $\operatorname{tr}(A)$, is the sum of the diagonal elements. Namely,

\[
\operatorname{tr}(A) = a_{11} + a_{22} + a_{33} + \cdots + a_{nn}
\]


The following theorem applies. 


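THEOREM 2.4: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are n-square matrices and $k$ is a scalar. Then
(i) $\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$, (iii) $\operatorname{tr}(A^T) = \operatorname{tr}(A)$,
(ii) $\operatorname{tr}(kA) = k \operatorname{tr}(A)$, (iv) $\operatorname{tr}(AB) = \operatorname{tr}(BA)$.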


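EXAMPLE 2.7 Let $A$ and $B$ be the matrices $A$ and $B$ in Example 2.6. Then
diagonal of $A = \{1, -4, 7\}$ and $\operatorname{tr}(A) = 1 - 4 + 7 = 4$
diagonal of $B = \{2, 3, -4\}$ and $\operatorname{tr}(B) = 2 + 3 - 4 = 1$

Moreover,
$\operatorname{tr}(A + B) = 3 - 1 + 3 = 5$, $\operatorname{tr}(2A) = 2 - 8 + 14 = 8$, $\operatorname{tr}(A^T) = 1 - 4 + 7 = 4$
$\operatorname{tr}(AB) = 5 + 0 - 35 = -30$, $\operatorname{tr}(BA) = 27 - 24 - 33 = -30$

As expected from Theorem 2.4,
$\operatorname{tr}(A + B) = \operatorname{tr}(A) + \operatorname{tr}(B)$, $\operatorname{tr}(A^T) = \operatorname{tr}(A)$, $\operatorname{tr}(2A) = 2 \operatorname{tr}(A)$

Furthermore, although $AB \neq BA$, the traces are equal.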
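
The trace computations of this example can be reproduced with a few lines of Python. This is a sketch only: `trace` and `matmul` are hypothetical helper names, and matrices are assumed to be lists of rows.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def trace(M):
    """Sum of the diagonal elements of a square matrix."""
    if any(len(row) != len(M) for row in M):
        raise ValueError("the trace is defined only for square matrices")
    return sum(M[i][i] for i in range(len(M)))


A = [[1, 2, 3], [-4, -4, -4], [5, 6, 7]]
B = [[2, -5, 1], [0, 3, -2], [1, 2, -4]]

tr_AB = trace(matmul(A, B))
tr_BA = trace(matmul(B, A))
products_equal = matmul(A, B) == matmul(B, A)
```

The two products are different matrices, yet `tr_AB` and `tr_BA` coincide, in line with Theorem 2.4(iv).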


Identity Matrix, Scalar Matrices 


The n-square identity or unit matrix, denoted by $I_n$, or simply $I$, is the n-square matrix with 1's on the diagonal and 0's elsewhere. The identity matrix $I$ is similar to the scalar 1 in that, for any n-square matrix $A$,

\[
AI = IA = A
\]

More generally, if $B$ is an $m \times n$ matrix, then $BI_n = I_m B = B$.
For any scalar $k$, the matrix $kI$ that contains $k$'s on the diagonal and 0's elsewhere is called the scalar matrix corresponding to the scalar $k$. Observe that

\[
(kI)A = k(IA) = kA
\]

That is, multiplying a matrix $A$ by the scalar matrix $kI$ is equivalent to multiplying $A$ by the scalar $k$.


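EXAMPLE 2.8 The following are the identity matrices of orders 3 and 4 and the corresponding scalar matrices for $k = 5$:

\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{bmatrix}, \quad
\begin{bmatrix} 5 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{bmatrix}, \quad
\begin{bmatrix} 5 & & & \\ & 5 & & \\ & & 5 & \\ & & & 5 \end{bmatrix}
\]

Remark 1: It is common practice to omit blocks or patterns of 0's when there is no ambiguity, as in the above second and fourth matrices.

Remark 2: The Kronecker delta function $\delta_{ij}$ is defined by

\[
\delta_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases}
\]

Thus, the identity matrix may be defined by $I = [\delta_{ij}]$.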
2.8 Powers of Matrices, Polynomials in Matrices

Let $A$ be an n-square matrix over a field $K$. Powers of $A$ are defined as follows:

\[
A^2 = AA, \quad A^3 = A^2 A, \quad \ldots, \quad A^{n+1} = A^n A, \quad \ldots, \quad\text{and}\quad A^0 = I
\]

Polynomials in the matrix $A$ are also defined. Specifically, for any polynomial

\[
f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n
\]

where the $a_i$ are scalars in $K$, $f(A)$ is defined to be the following matrix:

\[
f(A) = a_0 I + a_1 A + a_2 A^2 + \cdots + a_n A^n
\]

[Note that $f(A)$ is obtained from $f(x)$ by substituting the matrix $A$ for the variable $x$ and substituting the scalar matrix $a_0 I$ for the scalar $a_0$.] If $f(A)$ is the zero matrix, then $A$ is called a zero or root of $f(x)$.


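EXAMPLE 2.9 Suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}$. Then

\[
A^2 = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix}
\quad\text{and}\quad
A^3 = A^2 A = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} = \begin{bmatrix} -11 & 38 \\ 57 & -106 \end{bmatrix}
\]

Suppose $f(x) = 2x^2 - 3x + 5$ and $g(x) = x^2 + 3x - 10$. Then

\[
f(A) = 2\begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} - 3\begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} + 5\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 16 & -18 \\ -27 & 61 \end{bmatrix}
\]

\[
g(A) = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} + 3\begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} - 10\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

Thus, $A$ is a zero of the polynomial $g(x)$.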
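
Evaluating a polynomial at a matrix can be sketched in code as follows. Note that this sketch uses Horner's rule, $f(A) = (\cdots(a_n A + a_{n-1} I)A + \cdots)A + a_0 I$, rather than forming the powers of $A$ separately as the example does; the result is the same. The function name `poly_at_matrix` and the coefficient ordering (constant term first) are assumptions made for illustration.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def poly_at_matrix(coeffs, A):
    """Evaluate f(A) for f(x) = coeffs[0] + coeffs[1]*x + ... using Horner's rule."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, A)   # multiply the running value by A
        for i in range(n):
            result[i][i] += c        # add the scalar matrix c*I
    return result


A = [[1, 2], [3, -4]]
f_A = poly_at_matrix([5, -3, 2], A)    # f(x) = 2x^2 - 3x + 5
g_A = poly_at_matrix([-10, 3, 1], A)   # g(x) = x^2 + 3x - 10
```

With the matrix of Example 2.9, `g_A` is the zero matrix, which is the statement that $A$ is a root of $g(x)$.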


2.9 Invertible (Nonsingular) Matrices 


A square matrix A is said to be invertible or nonsingular if there exists a matrix B such that 
AB = BA =I 


where $I$ is the identity matrix. Such a matrix $B$ is unique. That is, if $AB_1 = B_1 A = I$ and $AB_2 = B_2 A = I$, then

\[
B_1 = B_1 I = B_1(AB_2) = (B_1 A)B_2 = IB_2 = B_2
\]

We call such a matrix $B$ the inverse of $A$ and denote it by $A^{-1}$. Observe that the above relation is symmetric; that is, if $B$ is the inverse of $A$, then $A$ is the inverse of $B$.


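EXAMPLE 2.10 Suppose that $A = \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & -5 \\ -1 & 2 \end{bmatrix}$. Then

\[
AB = \begin{bmatrix} 6-5 & -10+10 \\ 3-3 & -5+6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\quad\text{and}\quad
BA = \begin{bmatrix} 6-5 & 15-15 \\ -2+2 & -5+6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

Thus, $A$ and $B$ are inverses.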


It is known (Theorem 3.16) that $AB = I$ if and only if $BA = I$. Thus, it is necessary to test only one product to determine whether or not two given matrices are inverses. (See Problem 2.17.)
Now suppose $A$ and $B$ are invertible. Then $AB$ is invertible and $(AB)^{-1} = B^{-1}A^{-1}$. More generally, if $A_1, A_2, \ldots, A_k$ are invertible, then their product is invertible and

\[
(A_1 A_2 \cdots A_k)^{-1} = A_k^{-1} \cdots A_2^{-1} A_1^{-1}
\]

the product of the inverses in the reverse order.
Inverse of a 2 x 2 Matrix

Let $A$ be an arbitrary $2 \times 2$ matrix, say $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. We want to derive a formula for $A^{-1}$, the inverse of $A$. Specifically, we seek $2^2 = 4$ scalars, say $x_1, y_1, x_2, y_2$, such that

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x_1 & x_2 \\ y_1 & y_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\quad\text{or}\quad
\begin{bmatrix} ax_1 + by_1 & ax_2 + by_2 \\ cx_1 + dy_1 & cx_2 + dy_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

Setting the four entries equal to the corresponding entries in the identity matrix yields four equations, which can be partitioned into two $2 \times 2$ systems as follows:

\[
\begin{aligned}
ax_1 + by_1 &= 1, & ax_2 + by_2 &= 0 \\
cx_1 + dy_1 &= 0, & cx_2 + dy_2 &= 1
\end{aligned}
\]

Suppose we let $|A| = ad - bc$ (called the determinant of $A$). Assuming $|A| \neq 0$, we can solve uniquely for the above unknowns $x_1, y_1, x_2, y_2$, obtaining

\[
x_1 = \frac{d}{|A|}, \quad y_1 = \frac{-c}{|A|}, \quad x_2 = \frac{-b}{|A|}, \quad y_2 = \frac{a}{|A|}
\]

Accordingly,

\[
A^{-1} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \begin{bmatrix} d/|A| & -b/|A| \\ -c/|A| & a/|A| \end{bmatrix} = \frac{1}{|A|}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
\]

In other words, when $|A| \neq 0$, the inverse of a $2 \times 2$ matrix $A$ may be obtained from $A$ as follows:

(1) Interchange the two elements on the diagonal.
(2) Take the negatives of the other two elements.
(3) Multiply the resulting matrix by $1/|A|$ or, equivalently, divide each element by $|A|$.

In case $|A| = 0$, the matrix $A$ is not invertible.


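EXAMPLE 2.11 Find the inverse of $A = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}$.

First evaluate $|A| = 2(5) - 3(4) = 10 - 12 = -2$. Because $|A| \neq 0$, the matrix $A$ is invertible and

\[
A^{-1} = \frac{1}{-2}\begin{bmatrix} 5 & -3 \\ -4 & 2 \end{bmatrix} = \begin{bmatrix} -\tfrac{5}{2} & \tfrac{3}{2} \\ 2 & -1 \end{bmatrix}
\]

Now evaluate $|B| = 1(6) - 3(2) = 6 - 6 = 0$. Because $|B| = 0$, the matrix $B$ has no inverse.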


Remark: The above property that a matrix $A$ is invertible if and only if $A$ has a nonzero determinant
is true for square matrices of any order. (See Chapter 8.)
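
The three-step recipe for a $2 \times 2$ inverse translates directly into code. The sketch below assumes integer entries and uses exact rational arithmetic from Python's standard `fractions` module; the function name `inverse_2x2` and the convention of returning `None` for a singular matrix are illustrative choices.

```python
from fractions import Fraction


def inverse_2x2(M):
    """Inverse of a 2 x 2 integer matrix [[a, b], [c, d]], or None if |M| = 0."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None  # not invertible
    # Interchange the diagonal elements, negate the other two, divide by det.
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


A_inv = inverse_2x2([[2, 3], [4, 5]])   # determinant -2
B_inv = inverse_2x2([[1, 3], [2, 6]])   # determinant 0
```

For the matrices of Example 2.11, `A_inv` has rows $(-5/2,\ 3/2)$ and $(2,\ -1)$, and `B_inv` is `None`.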


Inverse of an n x n Matrix 


Suppose $A$ is an arbitrary n-square matrix. Finding its inverse $A^{-1}$ reduces, as above, to finding the
solution of a collection of $n \times n$ systems of linear equations. The solution of such systems and an efficient
way of solving such a collection of systems are treated in Chapter 3.


2.10 Special Types of Square Matrices 


This section describes a number of special kinds of square matrices. 


Diagonal and Triangular Matrices 


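A square matrix $D = [d_{ij}]$ is diagonal if its nondiagonal entries are all zero. Such a matrix is sometimes denoted by

\[
D = \operatorname{diag}(d_{11}, d_{22}, \ldots, d_{nn})
\]

where some or all the $d_{ii}$ may be zero. For example,

\[
\begin{bmatrix} 3 & 0 & 0 \\ 0 & -7 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \quad
\begin{bmatrix} 4 & 0 \\ 0 & -5 \end{bmatrix}, \quad
\begin{bmatrix} 6 & & & \\ & 0 & & \\ & & -9 & \\ & & & 8 \end{bmatrix}
\]

are diagonal matrices, which may be represented, respectively, by

\[
\operatorname{diag}(3, -7, 2), \quad \operatorname{diag}(4, -5), \quad \operatorname{diag}(6, 0, -9, 8)
\]

(Observe that patterns of 0's in the third matrix have been omitted.)
A square matrix $A = [a_{ij}]$ is upper triangular or simply triangular if all entries below the (main) diagonal are equal to 0, that is, if $a_{ij} = 0$ for $i > j$. Generic upper triangular matrices of orders 2, 3, 4 are as follows:

\[
\begin{bmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{bmatrix}, \quad
\begin{bmatrix} b_{11} & b_{12} & b_{13} \\ & b_{22} & b_{23} \\ & & b_{33} \end{bmatrix}, \quad
\begin{bmatrix} c_{11} & c_{12} & c_{13} & c_{14} \\ & c_{22} & c_{23} & c_{24} \\ & & c_{33} & c_{34} \\ & & & c_{44} \end{bmatrix}
\]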


(As with diagonal matrices, it is common practice to omit patterns of 0’s.) 
The following theorem applies. 


THEOREM 2.5: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are $n \times n$ (upper) triangular matrices. Then
(i) $A + B$, $kA$, $AB$ are triangular with respective diagonals:
\[
(a_{11} + b_{11}, \ldots, a_{nn} + b_{nn}), \quad (ka_{11}, \ldots, ka_{nn}), \quad (a_{11}b_{11}, \ldots, a_{nn}b_{nn})
\]
(ii) For any polynomial $f(x)$, the matrix $f(A)$ is triangular with diagonal
\[
(f(a_{11}), f(a_{22}), \ldots, f(a_{nn}))
\]
(iii) $A$ is invertible if and only if each diagonal element $a_{ii} \neq 0$, and when $A^{-1}$ exists
it is also triangular.


A lower triangular matrix is a square matrix whose entries above the diagonal are all zero. We note
that Theorem 2.5 is true if we replace "triangular" by either "lower triangular" or "diagonal."
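
Part (i) of Theorem 2.5 for products can be spot-checked on a pair of upper triangular matrices. The matrices below are hypothetical examples chosen for illustration, and `matmul` is an assumed helper working on lists of rows.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


A = [[2, 1, 4], [0, 3, -1], [0, 0, 5]]    # upper triangular
B = [[1, -2, 0], [0, 4, 7], [0, 0, -3]]   # upper triangular

P = matmul(A, B)
n = len(P)
# Upper triangular means every entry below the main diagonal is zero.
is_upper = all(P[i][j] == 0 for i in range(n) for j in range(i))
diagonal = [P[i][i] for i in range(n)]
expected = [A[i][i] * B[i][i] for i in range(n)]
```

The product stays upper triangular and its diagonal is the entrywise product of the two diagonals, here $(2 \cdot 1,\ 3 \cdot 4,\ 5 \cdot (-3))$.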


Remark: A nonempty collection A of matrices is called an algebra (of matrices) if A is closed 
under the operations of matrix addition, scalar multiplication, and matrix multiplication. Clearly, the 
square matrices with a given order form an algebra of matrices, but so do the scalar, diagonal, triangular, 
and lower triangular matrices. 


Special Real Square Matrices: Symmetric, Orthogonal, Normal 
[Optional until Chapter 12] 


Suppose now $A$ is a square matrix with real entries, that is, a real square matrix. The relationship between $A$ and its transpose $A^T$ yields important kinds of matrices.

(a) Symmetric Matrices

A matrix $A$ is symmetric if $A^T = A$. Equivalently, $A = [a_{ij}]$ is symmetric if symmetric elements (mirror elements with respect to the diagonal) are equal, that is, if each $a_{ij} = a_{ji}$.

A matrix $A$ is skew-symmetric if $A^T = -A$ or, equivalently, if each $a_{ij} = -a_{ji}$. Clearly, the diagonal elements of such a matrix must be zero, because $a_{ii} = -a_{ii}$ implies $a_{ii} = 0$.

(Note that a matrix $A$ must be square if $A^T = A$ or $A^T = -A$.)

EXAMPLE 2.12 Let
\[
A = \begin{bmatrix} 2 & -3 & 5 \\ -3 & 6 & 7 \\ 5 & 7 & -8 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 3 & -4 \\ -3 & 0 & 5 \\ 4 & -5 & 0 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

(a) By inspection, the symmetric elements in $A$ are equal, or $A^T = A$. Thus, $A$ is symmetric.

(b) The diagonal elements of $B$ are 0 and symmetric elements are negatives of each other, or $B^T = -B$. Thus, $B$ is skew-symmetric.

(c) Because $C$ is not square, $C$ is neither symmetric nor skew-symmetric.


(b) Orthogonal Matrices 


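A real matrix $A$ is orthogonal if $A^T = A^{-1}$, that is, if $AA^T = A^T A = I$. Thus, $A$ must necessarily be square and invertible.

EXAMPLE 2.13 Let
\[
A = \begin{bmatrix} \tfrac{1}{9} & \tfrac{8}{9} & -\tfrac{4}{9} \\ \tfrac{4}{9} & -\tfrac{4}{9} & -\tfrac{7}{9} \\ \tfrac{8}{9} & \tfrac{1}{9} & \tfrac{4}{9} \end{bmatrix}
\]
Multiplying $A$ by $A^T$ yields $I$; that is, $AA^T = I$. This means $A^T A = I$, as well. Thus, $A^T = A^{-1}$; that is, $A$ is orthogonal.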


Now suppose $A$ is a real orthogonal $3 \times 3$ matrix with rows

\[
u_1 = (a_1, a_2, a_3), \quad u_2 = (b_1, b_2, b_3), \quad u_3 = (c_1, c_2, c_3)
\]

Because $A$ is orthogonal, we must have $AA^T = I$. Namely,

\[
AA^T = \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}
\begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I
\]

Multiplying $A$ by $A^T$ and setting each entry equal to the corresponding entry in $I$ yields the following nine equations:

\[
\begin{aligned}
a_1^2 + a_2^2 + a_3^2 &= 1, & a_1 b_1 + a_2 b_2 + a_3 b_3 &= 0, & a_1 c_1 + a_2 c_2 + a_3 c_3 &= 0 \\
b_1 a_1 + b_2 a_2 + b_3 a_3 &= 0, & b_1^2 + b_2^2 + b_3^2 &= 1, & b_1 c_1 + b_2 c_2 + b_3 c_3 &= 0 \\
c_1 a_1 + c_2 a_2 + c_3 a_3 &= 0, & c_1 b_1 + c_2 b_2 + c_3 b_3 &= 0, & c_1^2 + c_2^2 + c_3^2 &= 1
\end{aligned}
\]

Accordingly, $u_1 \cdot u_1 = 1$, $u_2 \cdot u_2 = 1$, $u_3 \cdot u_3 = 1$, and $u_i \cdot u_j = 0$ for $i \neq j$. Thus, the rows $u_1, u_2, u_3$ are unit vectors and are orthogonal to each other.

Generally speaking, vectors $u_1, u_2, \ldots, u_m$ in $\mathbf{R}^n$ are said to form an orthonormal set of vectors if the vectors are unit vectors and are orthogonal to each other; that is,

\[
u_i \cdot u_j = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases}
\]

In other words, $u_i \cdot u_j = \delta_{ij}$ where $\delta_{ij}$ is the Kronecker delta function.

We have shown that the condition $AA^T = I$ implies that the rows of $A$ form an orthonormal set of vectors. The condition $A^T A = I$ similarly implies that the columns of $A$ also form an orthonormal set of vectors. Furthermore, because each step is reversible, the converse is true.


The above results for 3 x 3 matrices are true in general. That is, the following theorem holds. 


THEOREM 2.6: Let A be a real matrix. Then the following are equivalent: 
(a) A is orthogonal. 
(b) The rows of A form an orthonormal set. 
(c) The columns of A form an orthonormal set. 
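
The equivalence in Theorem 2.6 suggests a simple computational test for orthogonality: check that all pairwise dot products of the rows (or of the columns) reproduce the Kronecker delta. The sketch below uses exact fractions; the sample matrix has entries that are multiples of 1/9, and the helper names `dot` and `is_orthonormal` are assumptions made for illustration.

```python
from fractions import Fraction


def dot(u, v):
    """Dot product of two real vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))


def is_orthonormal(vectors):
    """True if u_i . u_j equals 1 when i == j and 0 otherwise."""
    return all(dot(u, v) == (1 if i == j else 0)
               for i, u in enumerate(vectors)
               for j, v in enumerate(vectors))


A = [[Fraction(x, 9) for x in row]
     for row in ([1, 8, -4], [4, -4, -7], [8, 1, 4])]

rows_ok = is_orthonormal(A)
cols_ok = is_orthonormal([list(col) for col in zip(*A)])
```

Both checks succeed for this matrix, as the theorem requires: once the rows are orthonormal, the columns must be as well.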


For n = 2, we have the following result (proved in Problem 2.28). 
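THEOREM 2.7: Let $A$ be a real $2 \times 2$ orthogonal matrix. Then, for some real number $\theta$,

\[
A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
\quad\text{or}\quad
A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}
\]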


(c) Normal Matrices 


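A real matrix $A$ is normal if it commutes with its transpose $A^T$, that is, if $AA^T = A^T A$. If $A$ is symmetric, orthogonal, or skew-symmetric, then $A$ is normal. There are also other normal matrices.

EXAMPLE 2.14 Let $A = \begin{bmatrix} 6 & -3 \\ 3 & 6 \end{bmatrix}$. Then

\[
AA^T = \begin{bmatrix} 6 & -3 \\ 3 & 6 \end{bmatrix}\begin{bmatrix} 6 & 3 \\ -3 & 6 \end{bmatrix} = \begin{bmatrix} 45 & 0 \\ 0 & 45 \end{bmatrix}
\quad\text{and}\quad
A^T A = \begin{bmatrix} 6 & 3 \\ -3 & 6 \end{bmatrix}\begin{bmatrix} 6 & -3 \\ 3 & 6 \end{bmatrix} = \begin{bmatrix} 45 & 0 \\ 0 & 45 \end{bmatrix}
\]

Because $AA^T = A^T A$, the matrix $A$ is normal.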


2.11 Complex Matrices 


Let $A$ be a complex matrix, that is, a matrix with complex entries. Recall (Section 1.7) that if $z = a + bi$ is a complex number, then $\bar{z} = a - bi$ is its conjugate. The conjugate of a complex matrix $A$, written $\bar{A}$, is the matrix obtained from $A$ by taking the conjugate of each entry in $A$. That is, if $A = [a_{ij}]$, then $\bar{A} = [b_{ij}]$, where $b_{ij} = \bar{a}_{ij}$. (We denote this fact by writing $\bar{A} = [\bar{a}_{ij}]$.)

The two operations of transpose and conjugation commute for any complex matrix $A$, and the special notation $A^H$ is used for the conjugate transpose of $A$. That is,

\[
A^H = (\bar{A})^T = \overline{(A^T)}
\]

Note that if $A$ is real, then $A^H = A^T$. [Some texts use $A^*$ instead of $A^H$.]

EXAMPLE 2.15 Let $A = \begin{bmatrix} 2+8i & 5-3i & 4-7i \\ 6i & 1-4i & 3+2i \end{bmatrix}$. Then $A^H = \begin{bmatrix} 2-8i & -6i \\ 5+3i & 1+4i \\ 4+7i & 3-2i \end{bmatrix}$.


Special Complex Matrices: Hermitian, Unitary, Normal [Optional until Chapter 12] 


Consider a complex matrix $A$. The relationship between $A$ and its conjugate transpose $A^H$ yields important kinds of complex matrices (which are analogous to the kinds of real matrices described above).

A complex matrix $A$ is said to be Hermitian or skew-Hermitian according as to whether

\[
A^H = A \quad\text{or}\quad A^H = -A.
\]

Clearly, $A = [a_{ij}]$ is Hermitian if and only if symmetric elements are conjugate, that is, if each $a_{ij} = \bar{a}_{ji}$, in which case each diagonal element $a_{ii}$ must be real. Similarly, if $A$ is skew-Hermitian, then each $a_{ij} = -\bar{a}_{ji}$, so each diagonal element $a_{ii}$ is purely imaginary or zero. (Note that $A$ must be square if $A^H = A$ or $A^H = -A$.)
A complex matrix $A$ is unitary if $A^H A = AA^H = I$, that is, if

\[
A^H = A^{-1}.
\]

Thus, $A$ must necessarily be square and invertible. We note that a complex matrix $A$ is unitary if and only if its rows (columns) form an orthonormal set relative to the dot product of complex vectors.

A complex matrix $A$ is said to be normal if it commutes with $A^H$, that is, if

\[
AA^H = A^H A
\]

(Thus, $A$ must be a square matrix.) This definition reduces to that for real matrices when $A$ is real.


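EXAMPLE 2.16 Consider the following complex matrices:

\[
A = \begin{bmatrix} 3 & 1-2i & 4+7i \\ 1+2i & -4 & -2i \\ 4-7i & 2i & 5 \end{bmatrix}, \quad
B = \frac{1}{2}\begin{bmatrix} 1 & -i & -1+i \\ i & 1 & 1+i \\ 1+i & -1+i & 0 \end{bmatrix}, \quad
C = \begin{bmatrix} 2+3i & 1 \\ i & 1+2i \end{bmatrix}
\]

(a) By inspection, the diagonal elements of $A$ are real, and the symmetric elements $1 - 2i$ and $1 + 2i$ are conjugate, $4 + 7i$ and $4 - 7i$ are conjugate, and $-2i$ and $2i$ are conjugate. Thus, $A$ is Hermitian.

(b) Multiplying $B$ by $B^H$ yields $I$; that is, $BB^H = I$. This implies $B^H B = I$, as well. Thus, $B^H = B^{-1}$, which means $B$ is unitary.

(c) To show $C$ is normal, we evaluate $CC^H$ and $C^H C$:

\[
CC^H = \begin{bmatrix} 2+3i & 1 \\ i & 1+2i \end{bmatrix}\begin{bmatrix} 2-3i & -i \\ 1 & 1-2i \end{bmatrix} = \begin{bmatrix} 14 & 4-4i \\ 4+4i & 6 \end{bmatrix}
\]

and similarly $C^H C = \begin{bmatrix} 14 & 4-4i \\ 4+4i & 6 \end{bmatrix}$. Because $CC^H = C^H C$, the complex matrix $C$ is normal.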


We note that when a matrix A is real, Hermitian is the same as symmetric, and unitary is the same as 
orthogonal. 
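
The three definitions can be tested mechanically with Python's built-in complex numbers. The following sketch assumes matrices with small integer real and imaginary parts, so that the floating-point arithmetic behind `complex` is exact; `conj_transpose` and `matmul` are illustrative helper names.

```python
def conj_transpose(A):
    """Conjugate transpose A^H of a matrix given as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


A = [[3, 1 - 2j, 4 + 7j],
     [1 + 2j, -4, -2j],
     [4 - 7j, 2j, 5]]
C = [[2 + 3j, 1],
     [1j, 1 + 2j]]

is_hermitian = conj_transpose(A) == A
CCH = matmul(C, conj_transpose(C))
CHC = matmul(conj_transpose(C), C)
is_normal = CCH == CHC
```

For matrices with non-integer entries, an exact `==` comparison would be too strict, and a tolerance-based comparison would be the safer choice.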


2.12 Block Matrices 


Using a system of horizontal and vertical (dashed) lines, we can partition a matrix A into submatrices 
called blocks (or cells) of A. Clearly a given matrix may be divided into blocks in different ways. For 
example, 


\[
\begin{bmatrix} 1 & -2 & 0 & 1 & 3 \\ 2 & 3 & 5 & 7 & -2 \\ 3 & 1 & 4 & 5 & 9 \\ 4 & 6 & -3 & 1 & 8 \end{bmatrix}
\]

(shown three times in the original, each with a different placement of the dashed partition lines)


The convenience of the partition of matrices, say $A$ and $B$, into blocks is that the result of operations on $A$ and $B$ can be obtained by carrying out the computation with the blocks, just as if they were the actual elements of the matrices. This is illustrated below, where the notation $A = [A_{ij}]$ will be used for a block matrix $A$ with blocks $A_{ij}$.

Suppose that $A = [A_{ij}]$ and $B = [B_{ij}]$ are block matrices with the same numbers of row and column blocks, and suppose that corresponding blocks have the same size. Then adding the corresponding blocks of $A$ and $B$ also adds the corresponding elements of $A$ and $B$, and multiplying each block of $A$ by a scalar $k$ multiplies each element of $A$ by $k$. Thus,

\[
A + B = \begin{bmatrix}
A_{11} + B_{11} & A_{12} + B_{12} & \cdots & A_{1n} + B_{1n} \\
A_{21} + B_{21} & A_{22} + B_{22} & \cdots & A_{2n} + B_{2n} \\
\vdots & \vdots & & \vdots \\
A_{m1} + B_{m1} & A_{m2} + B_{m2} & \cdots & A_{mn} + B_{mn}
\end{bmatrix}
\]

and

\[
kA = \begin{bmatrix}
kA_{11} & kA_{12} & \cdots & kA_{1n} \\
kA_{21} & kA_{22} & \cdots & kA_{2n} \\
\vdots & \vdots & & \vdots \\
kA_{m1} & kA_{m2} & \cdots & kA_{mn}
\end{bmatrix}
\]
The case of matrix multiplication is less obvious, but still true. That is, suppose that $U = [U_{ik}]$ and $V = [V_{kj}]$ are block matrices such that the number of columns of each block $U_{ik}$ is equal to the number of rows of each block $V_{kj}$. (Thus, each product $U_{ik}V_{kj}$ is defined.) Then

\[
UV = \begin{bmatrix}
W_{11} & W_{12} & \cdots & W_{1n} \\
W_{21} & W_{22} & \cdots & W_{2n} \\
\vdots & \vdots & & \vdots \\
W_{m1} & W_{m2} & \cdots & W_{mn}
\end{bmatrix},
\quad\text{where}\quad
W_{ij} = U_{i1}V_{1j} + U_{i2}V_{2j} + \cdots + U_{ip}V_{pj}
\]

The proof of the above formula for $UV$ is straightforward but detailed and lengthy. It is left as an exercise
(Problem 2.85).
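
Block multiplication can be illustrated on a small case. The sketch below is not a proof; it only compares the blockwise product with the ordinary product for one hypothetical pair of $3 \times 3$ matrices, each partitioned into a $2 + 1$ by $2 + 1$ grid of blocks. The nested-list representation and the helper names are assumptions.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def matadd(A, B):
    """Add two matrices of the same size."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(A, B)]


def block_matmul(U, V):
    """Blockwise product: W_ij = U_i1 V_1j + U_i2 V_2j + ... + U_ip V_pj."""
    W = []
    for U_row in U:
        W_row = []
        for j in range(len(V[0])):
            acc = matmul(U_row[0], V[0][j])
            for k in range(1, len(V)):
                acc = matadd(acc, matmul(U_row[k], V[k][j]))
            W_row.append(acc)
        W.append(W_row)
    return W


def assemble(blocks):
    """Flatten a block matrix into an ordinary list of rows."""
    rows = []
    for block_row in blocks:
        for r in range(len(block_row[0])):
            rows.append([x for blk in block_row for x in blk[r]])
    return rows


U = [[[[1, 2], [3, 4]], [[1], [0]]],
     [[[0, 1]],         [[2]]]]
V = [[[[1, 0], [2, 1]], [[3], [0]]],
     [[[1, 1]],         [[4]]]]

blockwise = assemble(block_matmul(U, V))
ordinary = matmul(assemble(U), assemble(V))
```

Both routes give the same $3 \times 3$ matrix, which is what the formula for $W_{ij}$ asserts.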


Square Block Matrices 
Let M be a block matrix. Then M is called a square block matrix if 


(i) M is a square matrix. 
(ii) The blocks form a square matrix. 
(iii) The diagonal blocks are also square matrices. 


The latter two conditions will occur if and only if there are the same number of horizontal and vertical 
lines and they are placed symmetrically. 


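Consider the following two block matrices:

\[
A = \left[\begin{array}{cc|cc|c}
1 & 2 & 3 & 4 & 5 \\
1 & 1 & 1 & 1 & 1 \\ \hline
9 & 8 & 7 & 6 & 5 \\ \hline
4 & 4 & 4 & 4 & 4 \\
3 & 5 & 3 & 5 & 3
\end{array}\right]
\quad\text{and}\quad
B = \left[\begin{array}{cc|cc|c}
1 & 2 & 3 & 4 & 5 \\
1 & 1 & 1 & 1 & 1 \\ \hline
9 & 8 & 7 & 6 & 5 \\
4 & 4 & 4 & 4 & 4 \\ \hline
3 & 5 & 3 & 5 & 3
\end{array}\right]
\]

The block matrix $A$ is not a square block matrix, because the second and third diagonal blocks are not square. On the other hand, the block matrix $B$ is a square block matrix.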


Block Diagonal Matrices 


Let $M = [A_{ij}]$ be a square block matrix such that the nondiagonal blocks are all zero matrices; that is, $A_{ij} = 0$ when $i \neq j$. Then $M$ is called a block diagonal matrix. We sometimes denote such a block diagonal matrix by writing

\[
M = \operatorname{diag}(A_{11}, A_{22}, \ldots, A_{rr}) \quad\text{or}\quad M = A_{11} \oplus A_{22} \oplus \cdots \oplus A_{rr}
\]

The importance of block diagonal matrices is that the algebra of the block matrix is frequently reduced to the algebra of the individual blocks. Specifically, suppose $f(x)$ is a polynomial and $M$ is the above block diagonal matrix. Then $f(M)$ is a block diagonal matrix, and

\[
f(M) = \operatorname{diag}(f(A_{11}), f(A_{22}), \ldots, f(A_{rr}))
\]

Also, $M$ is invertible if and only if each $A_{ii}$ is invertible, and, in such a case, $M^{-1}$ is a block diagonal matrix, and

\[
M^{-1} = \operatorname{diag}(A_{11}^{-1}, A_{22}^{-1}, \ldots, A_{rr}^{-1})
\]


Analogously, a square block matrix is called a block upper triangular matrix if the blocks below the 
diagonal are zero matrices and a block lower triangular matrix if the blocks above the diagonal are zero 
matrices. 
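EXAMPLE 2.17 Determine which of the following square block matrices are upper triangular, lower triangular, or diagonal:

\[
A = \left[\begin{array}{cc|c} 1 & 2 & 0 \\ 3 & 4 & 5 \\ \hline 0 & 0 & 6 \end{array}\right], \quad
B = \left[\begin{array}{c|cc|c} 1 & 0 & 0 & 0 \\ \hline 2 & 3 & 4 & 0 \\ 5 & 0 & 6 & 0 \\ \hline 0 & 7 & 8 & 9 \end{array}\right], \quad
C = \left[\begin{array}{c|cc} 1 & 0 & 0 \\ \hline 0 & 2 & 3 \\ 0 & 4 & 5 \end{array}\right], \quad
D = \left[\begin{array}{cc|c} 1 & 2 & 0 \\ 3 & 4 & 5 \\ \hline 0 & 6 & 7 \end{array}\right]
\]

(a) $A$ is upper triangular because the block below the diagonal is a zero block.

(b) $B$ is lower triangular because all blocks above the diagonal are zero blocks.

(c) $C$ is diagonal because the blocks above and below the diagonal are zero blocks.

(d) $D$ is neither upper triangular nor lower triangular. Also, no other partitioning of $D$ will make it into either a block upper triangular matrix or a block lower triangular matrix.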


SOLVED PROBLEMS 


Matrix Addition and Scalar Multiplication 


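2.1. Given $A = \begin{bmatrix} 1 & -2 & 3 \\ 4 & 5 & -6 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 0 & 2 \\ -7 & 1 & 8 \end{bmatrix}$, find: (a) $A + B$, (b) $2A - 3B$.

(a) Add the corresponding elements:

\[
A + B = \begin{bmatrix} 1+3 & -2+0 & 3+2 \\ 4-7 & 5+1 & -6+8 \end{bmatrix} = \begin{bmatrix} 4 & -2 & 5 \\ -3 & 6 & 2 \end{bmatrix}
\]

(b) First perform the scalar multiplication and then a matrix addition:

\[
2A - 3B = \begin{bmatrix} 2 & -4 & 6 \\ 8 & 10 & -12 \end{bmatrix} + \begin{bmatrix} -9 & 0 & -6 \\ 21 & -3 & -24 \end{bmatrix} = \begin{bmatrix} -7 & -4 & 0 \\ 29 & 7 & -36 \end{bmatrix}
\]

(Note that we multiply $B$ by $-3$ and then add, rather than multiplying $B$ by 3 and subtracting. This usually prevents errors.)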


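2.2. Find $x, y, z, t$ where $3\begin{bmatrix} x & y \\ z & t \end{bmatrix} = \begin{bmatrix} x & 6 \\ -1 & 2t \end{bmatrix} + \begin{bmatrix} 4 & x+y \\ z+t & 3 \end{bmatrix}$.
Write each side as a single matrix:
\[
\begin{bmatrix} 3x & 3y \\ 3z & 3t \end{bmatrix} = \begin{bmatrix} x+4 & x+y+6 \\ z+t-1 & 2t+3 \end{bmatrix}
\]
Set corresponding entries equal to each other to obtain the following system of four equations:
\[
3x = x + 4, \quad 3y = x + y + 6, \quad 3z = z + t - 1, \quad 3t = 2t + 3
\]
or
\[
2x = 4, \quad 2y = 6 + x, \quad 2z = t - 1, \quad t = 3
\]
The solution is $x = 2$, $y = 4$, $z = 1$, $t = 3$.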


2.3. Prove Theorem 2.1 (i) and (v): (i) $(A + B) + C = A + (B + C)$, (v) $k(A + B) = kA + kB$.

Suppose $A = [a_{ij}]$, $B = [b_{ij}]$, $C = [c_{ij}]$. The proof reduces to showing that corresponding ij-entries in each side of each matrix equation are equal. [We prove only (i) and (v), because the other parts of Theorem 2.1 are proved similarly.]

(i) The ij-entry of $A + B$ is $a_{ij} + b_{ij}$; hence, the ij-entry of $(A + B) + C$ is $(a_{ij} + b_{ij}) + c_{ij}$. On the other hand, the ij-entry of $B + C$ is $b_{ij} + c_{ij}$; hence, the ij-entry of $A + (B + C)$ is $a_{ij} + (b_{ij} + c_{ij})$. However, for scalars in $K$,

\[
(a_{ij} + b_{ij}) + c_{ij} = a_{ij} + (b_{ij} + c_{ij})
\]

Thus, $(A + B) + C$ and $A + (B + C)$ have identical ij-entries. Therefore, $(A + B) + C = A + (B + C)$.

(v) The ij-entry of $A + B$ is $a_{ij} + b_{ij}$; hence, $k(a_{ij} + b_{ij})$ is the ij-entry of $k(A + B)$. On the other hand, the ij-entries of $kA$ and $kB$ are $ka_{ij}$ and $kb_{ij}$, respectively. Thus, $ka_{ij} + kb_{ij}$ is the ij-entry of $kA + kB$. However, for scalars in $K$,

\[
k(a_{ij} + b_{ij}) = ka_{ij} + kb_{ij}
\]

Thus, $k(A + B)$ and $kA + kB$ have identical ij-entries. Therefore, $k(A + B) = kA + kB$.


Matrix Multiplication 


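2.4. Calculate: (a) $[8, -4, 5]\begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix}$, (b) $[6, -1, 7, 5]\begin{bmatrix} 4 \\ -9 \\ -3 \\ 2 \end{bmatrix}$, (c) $[3, 8, -2, 4]\begin{bmatrix} 5 \\ -1 \\ 6 \end{bmatrix}$

(a) Multiply the corresponding entries and add:

\[
[8, -4, 5]\begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix} = 8(3) + (-4)(2) + 5(-1) = 24 - 8 - 5 = 11
\]

(b) Multiply the corresponding entries and add:

\[
[6, -1, 7, 5]\begin{bmatrix} 4 \\ -9 \\ -3 \\ 2 \end{bmatrix} = 24 + 9 - 21 + 10 = 22
\]

(c) The product is not defined when the row matrix and the column matrix have different numbers of elements.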


2.5. Let $(r \times s)$ denote an $r \times s$ matrix. Find the sizes of those matrix products that are defined:
(a) $(2 \times 3)(3 \times 4)$, (c) $(1 \times 2)(3 \times 1)$, (e) $(4 \times 4)(3 \times 3)$
(b) $(4 \times 1)(1 \times 2)$, (d) $(5 \times 2)(2 \times 3)$, (f) $(2 \times 2)(2 \times 4)$

In each case, the product is defined if the inner numbers are equal, and then the product will have the size of the outer numbers in the given order.

(a) $2 \times 4$, (c) not defined, (e) not defined
(b) $4 \times 2$, (d) $5 \times 3$, (f) $2 \times 4$
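
The "inner numbers, outer numbers" rule is short enough to state as a function. The sketch below assumes a size is given as a pair `(rows, columns)`; the name `product_shape` and the use of `None` for an undefined product are illustrative choices.

```python
def product_shape(left, right):
    """Size of the product of an (r x s) and a (p x q) matrix, or None if undefined."""
    (r, s), (p, q) = left, right
    # The product is defined only when the inner numbers s and p agree.
    return (r, q) if s == p else None


sizes = [product_shape(a, b) for a, b in [
    ((2, 3), (3, 4)),  # (a)
    ((4, 1), (1, 2)),  # (b)
    ((1, 2), (3, 1)),  # (c)
    ((5, 2), (2, 3)),  # (d)
    ((4, 4), (3, 3)),  # (e)
    ((2, 2), (2, 4)),  # (f)
]]
```

The list `sizes` reproduces the six answers of Problem 2.5 in order (a) through (f).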


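2.6. Let $A = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 & -4 \\ 3 & -2 & 6 \end{bmatrix}$. Find: (a) $AB$, (b) $BA$.

(a) Because $A$ is a $2 \times 2$ matrix and $B$ a $2 \times 3$ matrix, the product $AB$ is defined and is a $2 \times 3$ matrix. To obtain the entries in the first row of $AB$, multiply the first row $[1, 3]$ of $A$ by the columns $\begin{bmatrix} 2 \\ 3 \end{bmatrix}, \begin{bmatrix} 0 \\ -2 \end{bmatrix}, \begin{bmatrix} -4 \\ 6 \end{bmatrix}$ of $B$, respectively, as follows:

\[
AB = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}\begin{bmatrix} 2 & 0 & -4 \\ 3 & -2 & 6 \end{bmatrix} = \begin{bmatrix} 2+9 & 0-6 & -4+18 \\ & & \end{bmatrix} = \begin{bmatrix} 11 & -6 & 14 \\ & & \end{bmatrix}
\]

To obtain the entries in the second row of $AB$, multiply the second row $[2, -1]$ of $A$ by the columns of $B$:

\[
AB = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}\begin{bmatrix} 2 & 0 & -4 \\ 3 & -2 & 6 \end{bmatrix} = \begin{bmatrix} 11 & -6 & 14 \\ 4-3 & 0+2 & -8-6 \end{bmatrix}
\]

Thus,

\[
AB = \begin{bmatrix} 11 & -6 & 14 \\ 1 & 2 & -14 \end{bmatrix}
\]

(b) The size of $B$ is $2 \times 3$ and that of $A$ is $2 \times 2$. The inner numbers 3 and 2 are not equal; hence, the product $BA$ is not defined.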


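2.7. Find $AB$, where $A = \begin{bmatrix} 2 & 3 & -1 \\ 4 & -2 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & -1 & 0 & 6 \\ 1 & 3 & -5 & 1 \\ 4 & 1 & -2 & 2 \end{bmatrix}$.

Because $A$ is a $2 \times 3$ matrix and $B$ a $3 \times 4$ matrix, the product $AB$ is defined and is a $2 \times 4$ matrix. Multiply the rows of $A$ by the columns of $B$ to obtain

\[
AB = \begin{bmatrix} 4+3-4 & -2+9-1 & 0-15+2 & 12+3-2 \\ 8-2+20 & -4-6+5 & 0+10-10 & 24-2+10 \end{bmatrix} = \begin{bmatrix} 3 & 6 & -13 & 13 \\ 26 & -5 & 0 & 32 \end{bmatrix}
\]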


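2.8. Find: (a) $\begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix}\begin{bmatrix} 2 \\ -7 \end{bmatrix}$, (b) $\begin{bmatrix} 2 \\ -7 \end{bmatrix}\begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix}$, (c) $[2, -7]\begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix}$.

(a) The first factor is $2 \times 2$ and the second is $2 \times 1$, so the product is defined as a $2 \times 1$ matrix:

\[
\begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix}\begin{bmatrix} 2 \\ -7 \end{bmatrix} = \begin{bmatrix} 2-42 \\ -6-35 \end{bmatrix} = \begin{bmatrix} -40 \\ -41 \end{bmatrix}
\]

(b) The product is not defined, because the first factor is $2 \times 1$ and the second factor is $2 \times 2$.

(c) The first factor is $1 \times 2$ and the second factor is $2 \times 2$, so the product is defined as a $1 \times 2$ (row) matrix:

\[
[2, -7]\begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix} = [2+21,\ 12-35] = [23, -23]
\]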


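2.9. Clearly, $0A = 0$ and $A0 = 0$, where the 0's are zero matrices (with possibly different sizes). Find matrices $A$ and $B$ with no zero entries such that $AB = 0$.

Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 6 & 2 \\ -3 & -1 \end{bmatrix}$. Then $AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$.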


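2.10. Prove Theorem 2.2(i): $(AB)C = A(BC)$.
Let $A = [a_{ij}]$, $B = [b_{jk}]$, $C = [c_{kl}]$, and let $AB = S = [s_{ik}]$, $BC = T = [t_{jl}]$. Then

\[
s_{ik} = \sum_{j=1}^{m} a_{ij}b_{jk} \quad\text{and}\quad t_{jl} = \sum_{k=1}^{n} b_{jk}c_{kl}
\]

Multiplying $S = AB$ by $C$, the il-entry of $(AB)C$ is

\[
s_{i1}c_{1l} + s_{i2}c_{2l} + \cdots + s_{in}c_{nl} = \sum_{k=1}^{n} s_{ik}c_{kl} = \sum_{k=1}^{n}\sum_{j=1}^{m} (a_{ij}b_{jk})c_{kl}
\]

On the other hand, multiplying $A$ by $T = BC$, the il-entry of $A(BC)$ is

\[
a_{i1}t_{1l} + a_{i2}t_{2l} + \cdots + a_{im}t_{ml} = \sum_{j=1}^{m} a_{ij}t_{jl} = \sum_{j=1}^{m}\sum_{k=1}^{n} a_{ij}(b_{jk}c_{kl})
\]

The above sums are equal; that is, corresponding elements in $(AB)C$ and $A(BC)$ are equal. Thus, $(AB)C = A(BC)$.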
2.11. Prove Theorem 2.2(ii): $A(B + C) = AB + AC$.
Let $A = [a_{ij}]$, $B = [b_{jk}]$, $C = [c_{jk}]$, and let $D = B + C = [d_{jk}]$, $E = AB = [e_{ik}]$, $F = AC = [f_{ik}]$. Then

\[
d_{jk} = b_{jk} + c_{jk}, \quad e_{ik} = \sum_{j=1}^{m} a_{ij}b_{jk}, \quad f_{ik} = \sum_{j=1}^{m} a_{ij}c_{jk}
\]

Thus, the ik-entry of the matrix $AB + AC$ is

\[
e_{ik} + f_{ik} = \sum_{j=1}^{m} a_{ij}b_{jk} + \sum_{j=1}^{m} a_{ij}c_{jk} = \sum_{j=1}^{m} a_{ij}(b_{jk} + c_{jk})
\]

On the other hand, the ik-entry of the matrix $AD = A(B + C)$ is

\[
a_{i1}d_{1k} + a_{i2}d_{2k} + \cdots + a_{im}d_{mk} = \sum_{j=1}^{m} a_{ij}d_{jk} = \sum_{j=1}^{m} a_{ij}(b_{jk} + c_{jk})
\]

Thus, $A(B + C) = AB + AC$, because the corresponding elements are equal.


Transpose 


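2.12. Find the transpose of each matrix:

\[
A = \begin{bmatrix} 1 & -2 & 3 \\ 7 & 8 & -9 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}, \quad
C = [1, -3, 5, -7], \quad
D = \begin{bmatrix} 2 \\ -4 \\ 6 \end{bmatrix}
\]

Rewrite the rows of each matrix as columns to obtain the transpose of the matrix:

\[
A^T = \begin{bmatrix} 1 & 7 \\ -2 & 8 \\ 3 & -9 \end{bmatrix}, \quad
B^T = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}, \quad
C^T = \begin{bmatrix} 1 \\ -3 \\ 5 \\ -7 \end{bmatrix}, \quad
D^T = [2, -4, 6]
\]

(Note that $B^T = B$; such a matrix is said to be symmetric. Note also that the transpose of the row vector $C$ is a column vector, and the transpose of the column vector $D$ is a row vector.)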


2.13. Prove Theorem 2.3(iv): $(AB)^T = B^T A^T$.
Let $A = [a_{ik}]$ and $B = [b_{kj}]$. Then the ij-entry of $AB$ is

\[
a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{im}b_{mj}
\]

This is the ji-entry (reverse order) of $(AB)^T$. Now column $j$ of $B$ becomes row $j$ of $B^T$, and row $i$ of $A$ becomes column $i$ of $A^T$. Thus, the ji-entry of $B^T A^T$ is

\[
[b_{1j}, b_{2j}, \ldots, b_{mj}][a_{i1}, a_{i2}, \ldots, a_{im}]^T = b_{1j}a_{i1} + b_{2j}a_{i2} + \cdots + b_{mj}a_{im}
\]

Thus, $(AB)^T = B^T A^T$, because the corresponding entries are equal.


Square Matrices 


2.14. Find the diagonal and trace of each matrix:

\[
\text{(a) } A = \begin{bmatrix} 1 & 3 & 6 \\ 2 & -5 & 8 \\ 4 & -2 & 9 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 2 & 4 & 8 \\ 3 & -7 & 9 \\ -5 & 0 & 2 \end{bmatrix}, \quad
\text{(c) } C = \begin{bmatrix} 1 & 2 & -3 \\ 4 & -5 & 6 \end{bmatrix}
\]

(a) The diagonal of $A$ consists of the elements from the upper left corner of $A$ to the lower right corner of $A$ or, in other words, the elements $a_{11}, a_{22}, a_{33}$. Thus, the diagonal of $A$ consists of the numbers 1, $-5$, and 9. The trace of $A$ is the sum of the diagonal elements. Thus,
\[
\operatorname{tr}(A) = 1 - 5 + 9 = 5
\]
(b) The diagonal of $B$ consists of the numbers 2, $-7$, and 2. Hence,
\[
\operatorname{tr}(B) = 2 - 7 + 2 = -3
\]

(c) The diagonal and trace are only defined for square matrices.
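2.15. Let $A = \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix}$, and let $f(x) = 2x^3 - 4x + 5$ and $g(x) = x^2 + 2x - 11$. Find
(a) $A^2$, (b) $A^3$, (c) $f(A)$, (d) $g(A)$.

(a) $A^2 = AA = \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix} = \begin{bmatrix} 1+8 & 2-6 \\ 4-12 & 8+9 \end{bmatrix} = \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix}$

(b) $A^3 = AA^2 = \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix}\begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix} = \begin{bmatrix} 9-16 & -4+34 \\ 36+24 & -16-51 \end{bmatrix} = \begin{bmatrix} -7 & 30 \\ 60 & -67 \end{bmatrix}$

(c) First substitute $A$ for $x$ and $5I$ for the constant in $f(x)$, obtaining

\[
f(A) = 2A^3 - 4A + 5I = 2\begin{bmatrix} -7 & 30 \\ 60 & -67 \end{bmatrix} - 4\begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix} + 5\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

Now perform the scalar multiplication and then the matrix addition:

\[
f(A) = \begin{bmatrix} -14 & 60 \\ 120 & -134 \end{bmatrix} + \begin{bmatrix} -4 & -8 \\ -16 & 12 \end{bmatrix} + \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix} = \begin{bmatrix} -13 & 52 \\ 104 & -117 \end{bmatrix}
\]

(d) Substitute $A$ for $x$ and $11I$ for the constant in $g(x)$, and then calculate as follows:

\[
g(A) = A^2 + 2A - 11I = \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix} + 2\begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix} - 11\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
\[
= \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix} + \begin{bmatrix} 2 & 4 \\ 8 & -6 \end{bmatrix} + \begin{bmatrix} -11 & 0 \\ 0 & -11 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

Because $g(A)$ is the zero matrix, $A$ is a root of the polynomial $g(x)$.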


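2.16. Let $A = \begin{bmatrix} 1 & 3 \\ 4 & -3 \end{bmatrix}$. (a) Find a nonzero column vector $u = \begin{bmatrix} x \\ y \end{bmatrix}$ such that $Au = 3u$. (b) Describe all such vectors.

(a) First set up the matrix equation $Au = 3u$, and then write each side as a single matrix (column vector) as follows:

\[
\begin{bmatrix} 1 & 3 \\ 4 & -3 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = 3\begin{bmatrix} x \\ y \end{bmatrix}
\quad\text{and then}\quad
\begin{bmatrix} x + 3y \\ 4x - 3y \end{bmatrix} = \begin{bmatrix} 3x \\ 3y \end{bmatrix}
\]

Set the corresponding elements equal to each other to obtain a system of equations:

\[
\begin{aligned} x + 3y &= 3x \\ 4x - 3y &= 3y \end{aligned}
\quad\text{or}\quad
\begin{aligned} 2x - 3y &= 0 \\ 4x - 6y &= 0 \end{aligned}
\quad\text{or}\quad
2x - 3y = 0
\]

The system reduces to one nondegenerate linear equation in two unknowns, and so has an infinite number of solutions. To obtain a nonzero solution, let, say, $y = 2$; then $x = 3$. Thus, $u = (3, 2)^T$ is a desired nonzero vector.

(b) To find the general solution, set $y = a$, where $a$ is a parameter. Substitute $y = a$ into $2x - 3y = 0$ to obtain $x = \tfrac{3}{2}a$. Thus, $u = (\tfrac{3}{2}a, a)^T$ represents all such solutions.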
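
The one-parameter family of solutions can be checked directly by substituting it back into $Au = 3u$. This sketch uses exact fractions; the helper names `apply` and `solution` are hypothetical, and `a` ranges over a few sample parameter values.

```python
from fractions import Fraction

A = [[1, 3], [4, -3]]


def apply(M, u):
    """Product of a matrix (list of rows) with a column vector (flat list)."""
    return [sum(m * x for m, x in zip(row, u)) for row in M]


def solution(a):
    """The vector u = (3a/2, a)^T for a given parameter a."""
    return [Fraction(3, 2) * a, Fraction(a)]


checks = [apply(A, solution(a)) == [3 * x for x in solution(a)]
          for a in (2, -1, 5, 0)]
```

Every entry of `checks` is `True`; the value `a = 2` gives the particular vector $(3, 2)^T$ found in part (a), and `a = 0` gives the zero vector, which satisfies the equation trivially.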


Invertible Matrices, Inverses 


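2.17. Show that $A = \begin{bmatrix} 1 & 0 & 2 \\ 2 & -1 & 3 \\ 4 & 1 & 8 \end{bmatrix}$ and $B = \begin{bmatrix} -11 & 2 & 2 \\ -4 & 0 & 1 \\ 6 & -1 & -1 \end{bmatrix}$ are inverses.

Compute the product $AB$, obtaining

\[
AB = \begin{bmatrix} -11+0+12 & 2+0-2 & 2+0-2 \\ -22+4+18 & 4+0-3 & 4-1-3 \\ -44-4+48 & 8+0-8 & 8+1-8 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I
\]

Because $AB = I$, we can conclude (Theorem 3.16) that $BA = I$. Accordingly, $A$ and $B$ are inverses.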
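2.18. Find the inverse, if possible, of each matrix:

\[
\text{(a) } A = \begin{bmatrix} 5 & 3 \\ 4 & 2 \end{bmatrix}, \quad
\text{(b) } B = \begin{bmatrix} 2 & -3 \\ 1 & 3 \end{bmatrix}, \quad
\text{(c) } C = \begin{bmatrix} -2 & 6 \\ 3 & -9 \end{bmatrix}
\]

Use the formula for the inverse of a $2 \times 2$ matrix appearing in Section 2.9.

(a) First find $|A| = 5(2) - 3(4) = 10 - 12 = -2$. Next interchange the diagonal elements, take the negatives of the nondiagonal elements, and multiply by $1/|A|$:

\[
A^{-1} = -\frac{1}{2}\begin{bmatrix} 2 & -3 \\ -4 & 5 \end{bmatrix} = \begin{bmatrix} -1 & \tfrac{3}{2} \\ 2 & -\tfrac{5}{2} \end{bmatrix}
\]

(b) First find $|B| = 2(3) - (-3)(1) = 6 + 3 = 9$. Next interchange the diagonal elements, take the negatives of the nondiagonal elements, and multiply by $1/|B|$:

\[
B^{-1} = \frac{1}{9}\begin{bmatrix} 3 & 3 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} \tfrac{1}{3} & \tfrac{1}{3} \\ -\tfrac{1}{9} & \tfrac{2}{9} \end{bmatrix}
\]

(c) First find $|C| = -2(-9) - 6(3) = 18 - 18 = 0$. Because $|C| = 0$, $C$ has no inverse.

2.19. Let $A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 4 \end{bmatrix}$. Find $A^{-1} = \begin{bmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{bmatrix}$.

Multiplying $A$ by $A^{-1}$ and setting the nine entries equal to the nine entries of the identity matrix $I$ yields the following three systems of three equations in three of the unknowns:

\[
\begin{aligned}
x_1 + y_1 + z_1 &= 1 & x_2 + y_2 + z_2 &= 0 & x_3 + y_3 + z_3 &= 0 \\
y_1 + 2z_1 &= 0 & y_2 + 2z_2 &= 1 & y_3 + 2z_3 &= 0 \\
x_1 + 2y_1 + 4z_1 &= 0 & x_2 + 2y_2 + 4z_2 &= 0 & x_3 + 2y_3 + 4z_3 &= 1
\end{aligned}
\]

[Note that $A$ is the coefficient matrix for all three systems.]
Solving the three systems for the nine unknowns yields

\[
x_1 = 0,\ y_1 = 2,\ z_1 = -1; \quad x_2 = -2,\ y_2 = 3,\ z_2 = -1; \quad x_3 = 1,\ y_3 = -2,\ z_3 = 1
\]

Thus,
\[
A^{-1} = \begin{bmatrix} 0 & -2 & 1 \\ 2 & 3 & -2 \\ -1 & -1 & 1 \end{bmatrix}
\]

(Remark: Chapter 3 gives an efficient way to solve the three systems.)

2.20. Let $A$ and $B$ be invertible matrices (with the same size). Show that $AB$ is also invertible and $(AB)^{-1} = B^{-1}A^{-1}$. [Thus, by induction, $(A_1 A_2 \cdots A_m)^{-1} = A_m^{-1} \cdots A_2^{-1} A_1^{-1}$.]

Using the associativity of matrix multiplication, we get

\[
\begin{aligned}
(AB)(B^{-1}A^{-1}) &= A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I \\
(B^{-1}A^{-1})(AB) &= B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I
\end{aligned}
\]

Thus, $(AB)^{-1} = B^{-1}A^{-1}$.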
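Diagonal and Triangular Matrices
2.21. Write out the diagonal matrices $A = \operatorname{diag}(4, -3, 7)$, $B = \operatorname{diag}(2, -6)$, $C = \operatorname{diag}(3, -8, 0, 5)$.

Put the given scalars on the diagonal and 0's elsewhere:

\[
A = \begin{bmatrix} 4 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 7 \end{bmatrix}, \quad
B = \begin{bmatrix} 2 & 0 \\ 0 & -6 \end{bmatrix}, \quad
C = \begin{bmatrix} 3 & & & \\ & -8 & & \\ & & 0 & \\ & & & 5 \end{bmatrix}
\]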


2.22. Let $A = \operatorname{diag}(2, 3, 5)$ and $B = \operatorname{diag}(7, 0, -4)$. Find
(a) $AB$, $A^2$, $B^2$; (b) $f(A)$, where $f(x) = x^2 + 3x - 2$; (c) $A^{-1}$ and $B^{-1}$.

(a) The product matrix $AB$ is a diagonal matrix obtained by multiplying corresponding diagonal entries; hence,

\[
AB = \operatorname{diag}(2(7), 3(0), 5(-4)) = \operatorname{diag}(14, 0, -20)
\]

Similarly, the squares $A^2$ and $B^2$ are obtained by squaring each diagonal entry; hence,

\[
A^2 = \operatorname{diag}(2^2, 3^2, 5^2) = \operatorname{diag}(4, 9, 25) \quad\text{and}\quad B^2 = \operatorname{diag}(49, 0, 16)
\]

(b) $f(A)$ is a diagonal matrix obtained by evaluating $f(x)$ at each diagonal entry. We have

\[
f(2) = 4 + 6 - 2 = 8, \quad f(3) = 9 + 9 - 2 = 16, \quad f(5) = 25 + 15 - 2 = 38
\]

Thus, $f(A) = \operatorname{diag}(8, 16, 38)$.

(c) The inverse of a diagonal matrix is a diagonal matrix obtained by taking the inverse (reciprocal) of each diagonal entry. Thus, $A^{-1} = \operatorname{diag}(\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{5})$, but $B$ has no inverse because there is a 0 on the diagonal.
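
Because every operation on diagonal matrices acts entry by entry on the diagonal, a diagonal matrix can be stored as just the list of its diagonal entries. The sketch below works under that assumption, with integer entries; the function names are illustrative.

```python
from fractions import Fraction


def diag_product(d1, d2):
    """Diagonal of the product of two diagonal matrices."""
    return [x * y for x, y in zip(d1, d2)]


def diag_poly(f, d):
    """Diagonal of f(D): evaluate f at each diagonal entry."""
    return [f(x) for x in d]


def diag_inverse(d):
    """Diagonal of the inverse, or None if some diagonal entry is 0."""
    if 0 in d:
        return None
    return [Fraction(1, x) for x in d]


A = [2, 3, 5]
B = [7, 0, -4]

AB = diag_product(A, B)
f_A = diag_poly(lambda x: x * x + 3 * x - 2, A)
A_inv = diag_inverse(A)
B_inv = diag_inverse(B)
```

These four values reproduce the answers of Problem 2.22: the product, $f(A)$, the inverse of $A$, and the absence of an inverse for $B$.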


2.23. Find a $2 \times 2$ matrix $A$ such that $A^2$ is diagonal but not $A$.

Let $A = \begin{bmatrix} 1 & 2 \\ 3 & -1 \end{bmatrix}$. Then $A^2 = \begin{bmatrix} 7 & 0 \\ 0 & 7 \end{bmatrix}$, which is diagonal.


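2.24. Find an upper triangular matrix $A$ such that $A^3 = \begin{bmatrix} 8 & -57 \\ 0 & 27 \end{bmatrix}$.

Set $A = \begin{bmatrix} x & y \\ 0 & z \end{bmatrix}$. Then $x^3 = 8$, so $x = 2$; and $z^3 = 27$, so $z = 3$. Next calculate $A^3$ using $x = 2$ and $z = 3$:

\[
A^2 = \begin{bmatrix} 2 & y \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 2 & y \\ 0 & 3 \end{bmatrix} = \begin{bmatrix} 4 & 5y \\ 0 & 9 \end{bmatrix}
\quad\text{and}\quad
A^3 = \begin{bmatrix} 2 & y \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 4 & 5y \\ 0 & 9 \end{bmatrix} = \begin{bmatrix} 8 & 19y \\ 0 & 27 \end{bmatrix}
\]

Thus, $19y = -57$, or $y = -3$. Accordingly, $A = \begin{bmatrix} 2 & -3 \\ 0 & 3 \end{bmatrix}$.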


2.25. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be upper triangular matrices. Prove that $AB$ is upper triangular with diagonal $a_{11}b_{11}, a_{22}b_{22}, \ldots, a_{nn}b_{nn}$.

Let $AB = [c_{ij}]$. Then $c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}$ and $c_{ii} = \sum_{k=1}^{n} a_{ik}b_{ki}$. Suppose $i > j$. Then, for any $k$, either $i > k$ or $k > j$, so that either $a_{ik} = 0$ or $b_{kj} = 0$. Thus, $c_{ij} = 0$, and $AB$ is upper triangular. Suppose $i = j$. Then, for $k < i$, we have $a_{ik} = 0$; and, for $k > i$, we have $b_{ki} = 0$. Hence, $c_{ii} = a_{ii}b_{ii}$, as claimed. [This proves one part of Theorem 2.5(i); the statements for $A + B$ and $kA$ are left as exercises.]
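
The claim of Problem 2.25 can be spot-checked numerically. The following is only a sketch on two hypothetical $3 \times 3$ upper triangular matrices (not taken from the text), with matrices stored as lists of rows; it illustrates the statement and is not a substitute for the proof.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def is_upper_triangular(m):
    """True when every entry below the main diagonal is 0."""
    return all(m[i][j] == 0 for i in range(len(m)) for j in range(i))

# Hypothetical upper triangular matrices chosen only for illustration.
A = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
B = [[7, 8, 9], [0, 1, 2], [0, 0, 3]]

C = matmul(A, B)
diagonal = [C[i][i] for i in range(3)]
expected = [A[i][i] * B[i][i] for i in range(3)]  # a_ii * b_ii

print(C)
print(is_upper_triangular(C), diagonal == expected)
```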
Special Real Matrices: Symmetric and Orthogonal

2.26. Determine whether or not each of the following matrices is symmetric (that is, $A^T = A$) or skew-symmetric (that is, $A^T = -A$):

$$\text{(a) } A = \begin{bmatrix} 5 & -7 & 1 \\ -7 & 8 & 2 \\ 1 & 2 & -4 \end{bmatrix}, \qquad \text{(b) } B = \begin{bmatrix} 0 & 4 & -3 \\ -4 & 0 & 5 \\ 3 & -5 & 0 \end{bmatrix}, \qquad \text{(c) } C = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

(a) By inspection, the symmetric elements (mirror images in the diagonal) are $-7$ and $-7$, 1 and 1, 2 and 2. Thus, $A$ is symmetric, because symmetric elements are equal.

(b) By inspection, the diagonal elements are all 0, and the symmetric elements, 4 and $-4$, $-3$ and 3, and 5 and $-5$, are negatives of each other. Hence, $B$ is skew-symmetric.

(c) Because $C$ is not square, $C$ is neither symmetric nor skew-symmetric.
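
The inspection in Problem 2.26 amounts to comparing a matrix with its transpose. A minimal sketch, assuming matrices are stored as lists of rows (the function name `classify` is hypothetical):

```python
def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def classify(m):
    """Return 'symmetric', 'skew-symmetric', or 'neither' for the matrix m."""
    if any(len(row) != len(m) for row in m):
        return "neither"                      # not square
    t = transpose(m)
    if t == m:
        return "symmetric"                    # a square zero matrix lands here
    if t == [[-x for x in row] for row in m]:
        return "skew-symmetric"
    return "neither"

A = [[5, -7, 1], [-7, 8, 2], [1, 2, -4]]
B = [[0, 4, -3], [-4, 0, 5], [3, -5, 0]]
C = [[0, 0, 0], [0, 0, 0]]

for name, m in (("A", A), ("B", B), ("C", C)):
    print(name, classify(m))
```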


2.27. Suppose $B = \begin{bmatrix} 4 & x+2 \\ 2x-3 & x+1 \end{bmatrix}$ is symmetric. Find $x$ and $B$.

Set the symmetric elements $x + 2$ and $2x - 3$ equal to each other, obtaining $2x - 3 = x + 2$ or $x = 5$. Hence, $B = \begin{bmatrix} 4 & 7 \\ 7 & 6 \end{bmatrix}$.


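2.28. Let $A$ be an arbitrary $2 \times 2$ (real) orthogonal matrix.
(a) Prove: If $(a, b)$ is the first row of $A$, then $a^2 + b^2 = 1$ and

$$A = \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \quad \text{or} \quad A = \begin{bmatrix} a & b \\ b & -a \end{bmatrix}$$

(b) Prove Theorem 2.7: For some real number $\theta$,

$$A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \quad \text{or} \quad A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}$$

(a) Suppose $(x, y)$ is the second row of $A$. Because the rows of $A$ form an orthonormal set, we get

$$a^2 + b^2 = 1, \qquad x^2 + y^2 = 1, \qquad ax + by = 0$$

Similarly, the columns form an orthonormal set, so

$$a^2 + x^2 = 1, \qquad b^2 + y^2 = 1, \qquad ab + xy = 0$$

Therefore, $x^2 = 1 - a^2 = b^2$, whence $x = \pm b$.

Case (i): $x = b$. Then $b(a + y) = 0$, so $y = -a$.
Case (ii): $x = -b$. Then $b(y - a) = 0$, so $y = a$.

This means, as claimed,

$$A = \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \quad \text{or} \quad A = \begin{bmatrix} a & b \\ b & -a \end{bmatrix}$$

(b) Because $a^2 + b^2 = 1$, we have $-1 \le a \le 1$. Let $a = \cos\theta$. Then $b^2 = 1 - \cos^2\theta = \sin^2\theta$, so $b = \pm\sin\theta$; replacing $\theta$ by $-\theta$ if necessary (which leaves $a = \cos\theta$ unchanged), we may take $b = \sin\theta$. Substituting into part (a) proves the theorem.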


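2.29. Find a $2 \times 2$ orthogonal matrix $A$ whose first row is a (positive) multiple of $(3, 4)$.

Normalize $(3, 4)$ to get $(\tfrac35, \tfrac45)$. Then, by Problem 2.28,

$$A = \begin{bmatrix} \tfrac35 & \tfrac45 \\ -\tfrac45 & \tfrac35 \end{bmatrix} \quad \text{or} \quad A = \begin{bmatrix} \tfrac35 & \tfrac45 \\ \tfrac45 & -\tfrac35 \end{bmatrix}$$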


2.30. Find a $3 \times 3$ orthogonal matrix $P$ whose first two rows are multiples of $u_1 = (1, 1, 1)$ and $u_2 = (0, -1, 1)$, respectively. (Note that, as required, $u_1$ and $u_2$ are orthogonal.)

First find a nonzero vector $u_3$ orthogonal to $u_1$ and $u_2$; say (cross product) $u_3 = u_1 \times u_2 = (2, -1, -1)$. Let $A$ be the matrix whose rows are $u_1, u_2, u_3$; and let $P$ be the matrix obtained from $A$ by normalizing the rows of $A$. Thus,

$$A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & 1 \\ 2 & -1 & -1 \end{bmatrix} \quad \text{and} \quad P = \begin{bmatrix} 1/\sqrt3 & 1/\sqrt3 & 1/\sqrt3 \\ 0 & -1/\sqrt2 & 1/\sqrt2 \\ 2/\sqrt6 & -1/\sqrt6 & -1/\sqrt6 \end{bmatrix}$$

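
The construction in Problem 2.30 (cross product, then row normalization) can be sketched in a few lines of Python. Floating-point arithmetic is used, so the orthogonality test compares against the identity only up to a small tolerance; the tolerance value is an assumption of this sketch.

```python
import math

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def normalize(u):
    """Scale u to unit length."""
    length = math.sqrt(sum(x * x for x in u))
    return [x / length for x in u]

def is_orthogonal(p, tol=1e-12):
    """Check that the rows of p form an orthonormal set (P P^T = I)."""
    n = len(p)
    for i in range(n):
        for j in range(n):
            dot = sum(p[i][k] * p[j][k] for k in range(n))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    return True

u1 = [1, 1, 1]
u2 = [0, -1, 1]
u3 = cross(u1, u2)                      # orthogonal to both u1 and u2
P = [normalize(u) for u in (u1, u2, u3)]

print(u3)
print(is_orthogonal(P))
```

Note that the unnormalized matrix with rows $u_1, u_2, u_3$ has orthogonal rows but is not an orthogonal matrix, which is why the normalization step is needed.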
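Complex Matrices: Hermitian and Unitary Matrices

2.31. Find $A^H$, where

$$\text{(a) } A = \begin{bmatrix} 3-5i & 2+4i \\ 6+7i & 1+8i \end{bmatrix}, \qquad \text{(b) } A = \begin{bmatrix} 2-3i & 5+8i \\ -4 & 3-7i \\ -6-i & 5i \end{bmatrix}$$

Recall that $A^H = \bar{A}^T$, the conjugate transpose of $A$. Thus,

$$\text{(a) } A^H = \begin{bmatrix} 3+5i & 6-7i \\ 2-4i & 1-8i \end{bmatrix}, \qquad \text{(b) } A^H = \begin{bmatrix} 2+3i & -4 & -6+i \\ 5-8i & 3+7i & -5i \end{bmatrix}$$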


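2.32. Show that $A = \begin{bmatrix} \tfrac13 - \tfrac23 i & \tfrac23 i \\ -\tfrac23 i & -\tfrac13 - \tfrac23 i \end{bmatrix}$ is unitary.

The rows of $A$ form an orthonormal set:

$$\left(\tfrac13 - \tfrac23 i,\ \tfrac23 i\right) \cdot \left(\tfrac13 - \tfrac23 i,\ \tfrac23 i\right) = \left(\tfrac19 + \tfrac49\right) + \tfrac49 = 1$$

$$\left(\tfrac13 - \tfrac23 i,\ \tfrac23 i\right) \cdot \left(-\tfrac23 i,\ -\tfrac13 - \tfrac23 i\right) = \left(\tfrac29 i + \tfrac49\right) + \left(-\tfrac29 i - \tfrac49\right) = 0$$

$$\left(-\tfrac23 i,\ -\tfrac13 - \tfrac23 i\right) \cdot \left(-\tfrac23 i,\ -\tfrac13 - \tfrac23 i\right) = \tfrac49 + \left(\tfrac19 + \tfrac49\right) = 1$$

Thus, $A$ is unitary.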


2.33. Prove the complex analogue of Theorem 2.6: Let $A$ be a complex matrix. Then the following are equivalent: (i) $A$ is unitary. (ii) The rows of $A$ form an orthonormal set. (iii) The columns of $A$ form an orthonormal set.

(The proof is almost identical to the proof on page 37 for the case when $A$ is a $3 \times 3$ real matrix.)
First recall that the vectors $u_1, u_2, \ldots, u_r$ in $\mathbf{C}^n$ form an orthonormal set if they are unit vectors and are orthogonal to each other, where the dot product in $\mathbf{C}^n$ is defined by

$$(a_1, a_2, \ldots, a_n) \cdot (b_1, b_2, \ldots, b_n) = a_1\bar{b}_1 + a_2\bar{b}_2 + \cdots + a_n\bar{b}_n$$

Suppose $A$ is unitary, and $R_1, R_2, \ldots, R_n$ are its rows. Then $\bar{R}_1^T, \bar{R}_2^T, \ldots, \bar{R}_n^T$ are the columns of $A^H$. Let $AA^H = [c_{ij}]$. By matrix multiplication, $c_{ij} = R_i\bar{R}_j^T = R_i \cdot R_j$. Because $A$ is unitary, we have $AA^H = I$. Multiplying $A$ by $A^H$ and setting each entry $c_{ij}$ equal to the corresponding entry in $I$ yields the following $n^2$ equations:

$$R_1 \cdot R_1 = 1, \quad R_2 \cdot R_2 = 1, \quad \ldots, \quad R_n \cdot R_n = 1, \quad \text{and} \quad R_i \cdot R_j = 0 \ \text{ for } i \ne j$$

Thus, the rows of $A$ are unit vectors and are orthogonal to each other; hence, they form an orthonormal set of vectors. The condition $A^HA = I$ similarly shows that the columns of $A$ also form an orthonormal set of vectors. Furthermore, because each step is reversible, the converse is true. This proves the theorem.
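
The row criterion of Problem 2.33 translates directly into a numerical test. The sketch below uses Python's built-in complex numbers and the dot product $u \cdot v = \sum_k u_k\bar{v}_k$; the sample matrix and the tolerance are assumptions chosen for illustration.

```python
def hermitian_dot(u, v):
    """Dot product in C^n: sum of u_k times the conjugate of v_k."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def is_unitary(m, tol=1e-12):
    """True when the rows of the square matrix m form an orthonormal set."""
    n = len(m)
    if any(len(row) != n for row in m):
        return False
    for i in range(n):
        for j in range(n):
            target = 1 if i == j else 0
            if abs(hermitian_dot(m[i], m[j]) - target) > tol:
                return False
    return True

# Sample matrix (assumed for illustration): entries are multiples of 1/3.
A = [[complex(1, -2) / 3, complex(0, 2) / 3],
     [complex(0, -2) / 3, complex(-1, -2) / 3]]

print(is_unitary(A))
print(is_unitary([[1, 1], [0, 1]]))   # rows are not orthonormal
```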


Block Matrices 


2.34. Consider the following block matrices (which are partitions of the same matrix): 
Find the size of each block matrix and also the size of each block.


(a) The block matrix has two rows of matrices and three columns of matrices; hence, its size is 2 x 3. The 
block sizes are 2 x 2, 2 x 2, and 2 x 1 for the first row; and 1 x 2, 1 x 2, and 1 x 1 for the second row. 


(b) The size of the block matrix is 3 x 2; and the block sizes are 1 x 3 and 1 x 2 for each of the three rows. 


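2.35. Compute $AB$ using block multiplication, where

$$A = \left[\begin{array}{cc|c} 1 & 2 & 1 \\ 3 & 4 & 0 \\ \hline 0 & 0 & 2 \end{array}\right] \quad \text{and} \quad B = \left[\begin{array}{ccc|c} 1 & 2 & 3 & 1 \\ 4 & 5 & 6 & 1 \\ \hline 0 & 0 & 0 & 1 \end{array}\right]$$

Here $A = \begin{bmatrix} E & F \\ 0_{1\times 2} & G \end{bmatrix}$ and $B = \begin{bmatrix} R & S \\ 0_{1\times 3} & T \end{bmatrix}$, where $E, F, G, R, S, T$ are the given blocks, and $0_{1\times 2}$ and $0_{1\times 3}$ are zero matrices of the indicated sizes. Hence,

$$AB = \begin{bmatrix} ER & ES + FT \\ 0_{1\times 3} & GT \end{bmatrix} = \left[\begin{array}{c|c} \begin{matrix} 9 & 12 & 15 \\ 19 & 26 & 33 \end{matrix} & \begin{matrix} 3 \\ 7 \end{matrix} + \begin{matrix} 1 \\ 0 \end{matrix} \\ \hline \begin{matrix} 0 & 0 & 0 \end{matrix} & 2 \end{array}\right] = \begin{bmatrix} 9 & 12 & 15 & 4 \\ 19 & 26 & 33 & 7 \\ 0 & 0 & 0 & 2 \end{bmatrix}$$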
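
As a cross-check of Problem 2.35, the block formula can be compared with the ordinary product. This is a minimal sketch with matrices stored as lists of rows; the helpers `matadd` and `hstack` are hypothetical names introduced here.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def matadd(a, b):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def hstack(a, b):
    """Place block b to the right of block a (same number of rows)."""
    return [ra + rb for ra, rb in zip(a, b)]

# Blocks of A and B as in Problem 2.35.
E, F, G = [[1, 2], [3, 4]], [[1], [0]], [[2]]
R, S, T = [[1, 2, 3], [4, 5, 6]], [[1], [1]], [[1]]
zero_1x2, zero_1x3 = [[0, 0]], [[0, 0, 0]]

# Block formula: AB = [ER, ES + FT; 0, GT].
top = hstack(matmul(E, R), matadd(matmul(E, S), matmul(F, T)))
bottom = hstack(zero_1x3, matmul(G, T))
AB_blocks = top + bottom

# Ordinary product of the assembled matrices, for comparison.
A = hstack(E, F) + hstack(zero_1x2, G)
B = hstack(R, S) + hstack(zero_1x3, T)
print(AB_blocks == matmul(A, B))
print(AB_blocks)
```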


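2.36. Let $M = \text{diag}(A, B, C)$, where $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $B = [5]$, $C = \begin{bmatrix} 1 & 3 \\ 5 & 7 \end{bmatrix}$. Find $M^2$.

Because $M$ is block diagonal, square each block:

$$A^2 = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix}, \qquad B^2 = [25], \qquad C^2 = \begin{bmatrix} 16 & 24 \\ 40 & 64 \end{bmatrix}$$

So

$$M^2 = \begin{bmatrix} 7 & 10 & 0 & 0 & 0 \\ 15 & 22 & 0 & 0 & 0 \\ 0 & 0 & 25 & 0 & 0 \\ 0 & 0 & 0 & 16 & 24 \\ 0 & 0 & 0 & 40 & 64 \end{bmatrix}$$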

Miscellaneous Problem 


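2.37. Let $f(x)$ and $g(x)$ be polynomials and let $A$ be a square matrix. Prove
(a) $(f + g)(A) = f(A) + g(A)$,
(b) $(fg)(A) = f(A)g(A)$,
(c) $f(A)g(A) = g(A)f(A)$.

Suppose $f(x) = \sum_{i=0}^{r} a_i x^i$ and $g(x) = \sum_{j=0}^{s} b_j x^j$, where $A^0$ is read as $I$ when $A$ is substituted for $x$.

(a) We can assume $r = s = n$ by adding powers of $x$ with 0 as their coefficients. Then

$$f(x) + g(x) = \sum_{i=0}^{n} (a_i + b_i)x^i$$

Hence,

$$(f + g)(A) = \sum_{i=0}^{n} (a_i + b_i)A^i = \sum_{i=0}^{n} a_iA^i + \sum_{i=0}^{n} b_iA^i = f(A) + g(A)$$

(b) We have $f(x)g(x) = \sum_{i,j} a_ib_jx^{i+j}$. Then

$$f(A)g(A) = \Big(\sum_i a_iA^i\Big)\Big(\sum_j b_jA^j\Big) = \sum_{i,j} a_ib_jA^{i+j} = (fg)(A)$$

(c) Using $f(x)g(x) = g(x)f(x)$, we have

$$f(A)g(A) = (fg)(A) = (gf)(A) = g(A)f(A)$$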
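
The three identities of Problem 2.37 can be spot-checked on a concrete matrix. In the sketch below a polynomial is stored as its list of coefficients (`coeffs[i]` multiplies $x^i$) and evaluated at a matrix by Horner's rule; the matrix and the two polynomials are hypothetical choices for illustration only.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def poly_eval(coeffs, a):
    """Evaluate the polynomial at the matrix a (Horner's rule, constant term times I)."""
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, a)
        for i in range(n):
            result[i][i] += c
    return result

def poly_add(f, g):
    """Coefficients of f + g, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [x + y for x, y in zip(f, g)]

def poly_mul(f, g):
    """Coefficients of the product fg."""
    prod = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            prod[i + j] += fi * gj
    return prod

A = [[1, 2], [3, -4]]          # hypothetical sample matrix
f = [-2, 3, 1]                 # f(x) = x^2 + 3x - 2
g = [1, 0, -1, 2]              # g(x) = 2x^3 - x^2 + 1

fA, gA = poly_eval(f, A), poly_eval(g, A)
print(poly_eval(poly_add(f, g), A) == matadd(fA, gA))   # (a)
print(poly_eval(poly_mul(f, g), A) == matmul(fA, gA))   # (b)
print(matmul(fA, gA) == matmul(gA, fA))                 # (c)
```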
Algebra of Matrices

Problems 2.38-2.41 refer to the following matrices:

$$A = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}, \quad B = \begin{bmatrix} 5 & 0 \\ -6 & 7 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & -3 & 4 \\ 2 & 6 & -5 \end{bmatrix}, \quad D = \begin{bmatrix} 3 & 7 & -1 \\ 4 & -8 & 9 \end{bmatrix}$$

2.38. Find (a) $5A - 2B$, (b) $2A + 3B$, (c) $2C - 3D$.

2.39. Find (a) $AB$ and $(AB)C$, (b) $BC$ and $A(BC)$. [Note that $(AB)C = A(BC)$.]

2.40. Find (a) $A^2$ and $A^3$, (b) $AD$ and $BD$, (c) $CD$.

2.41. Find (a) $A^T$, (b) $B^T$, (c) $(AB)^T$, (d) $A^TB^T$. [Note that $A^TB^T \ne (AB)^T$.]
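Problems 2.42 and 2.43 refer to the following matrices:

$$A = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 4 & 0 & -3 \\ -1 & -2 & 3 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & -3 & 0 & 1 \\ 5 & -1 & -4 & 2 \\ -1 & 0 & 0 & 3 \end{bmatrix}, \quad D = \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}$$

2.42. Find (a) $3A - 4B$, (b) $AC$, (c) $BC$, (d) $AD$, (e) $BD$, (f) $CD$.

2.43. Find (a) $A^T$, (b) $A^TB$, (c) $A^TC$.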


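2.44. Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix}$. Find a $2 \times 3$ matrix $B$ with distinct nonzero entries such that $AB = 0$.

2.45. Let $e_1 = [1, 0, 0]$, $e_2 = [0, 1, 0]$, $e_3 = [0, 0, 1]$, and $A = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \end{bmatrix}$. Find $e_1A$, $e_2A$, $e_3A$.

2.46. Let $e_i = [0, \ldots, 0, 1, 0, \ldots, 0]$, where 1 is the $i$th entry. Show
(a) $e_iA = A_i$, the $i$th row of $A$.
(b) $Be_j^T = B^j$, the $j$th column of $B$.
(c) If $e_iA = e_iB$, for each $i$, then $A = B$.
(d) If $Ae_j^T = Be_j^T$, for each $j$, then $A = B$.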


2.47. Prove Theorem 2.2(iii) and (iv): (iii) $(B + C)A = BA + CA$, (iv) $k(AB) = (kA)B = A(kB)$.

2.48. Prove Theorem 2.3: (i) $(A + B)^T = A^T + B^T$, (ii) $(A^T)^T = A$, (iii) $(kA)^T = kA^T$.


2.49. Show (a) If A has a zero row, then AB has a zero row. (b) If B has a zero column, then AB has a 
zero column. 


Square Matrices, Inverses 


2.50. Find the diagonal and trace of each of the following matrices: 


2-5 8 ; 3 =f io ek 
(a) A4=|3 -6 -7|, œ B=|6 1 7, © |; ae 4 
4 0 -1 2 -5 -I 


Problems 2.51-2.53 refer to $A = \begin{bmatrix} 2 & -5 \\ 3 & 1 \end{bmatrix}$, $B = \begin{bmatrix} 4 & -2 \\ 1 & -6 \end{bmatrix}$, $C = \begin{bmatrix} 6 & -4 \\ 3 & -2 \end{bmatrix}$.

2.51. Find (a) $A^2$ and $A^3$, (b) $f(A)$ and $g(A)$, where
$f(x) = x^3 - 2x^2 - 5$, $g(x) = x^2 - 3x + 17$.

2.52. Find (a) $B^2$ and $B^3$, (b) $f(B)$ and $g(B)$, where
$f(x) = x^2 + 2x - 22$, $g(x) = x^2 - 3x - 6$.

2.53. Find a nonzero column vector $u$ such that $Cu = 4u$.


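2.54. Find the inverse of each of the following matrices (if it exists):

$$A = \begin{bmatrix} 7 & 4 \\ 5 & 3 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 4 & -6 \\ -2 & 3 \end{bmatrix}, \quad D = \begin{bmatrix} 5 & -2 \\ 6 & -3 \end{bmatrix}$$

2.55. Find the inverses of $A = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 5 \\ 1 & 3 & 7 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & -1 \\ 1 & 3 & -2 \end{bmatrix}$. [Hint: See Problem 2.19.]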


2.56. Suppose $A$ is invertible. Show that if $AB = AC$, then $B = C$. Give an example of a nonzero matrix $A$ such that $AB = AC$ but $B \ne C$.

2.57. Find $2 \times 2$ invertible matrices $A$ and $B$ such that $A + B \ne 0$ and $A + B$ is not invertible.

2.58. Show (a) $A$ is invertible if and only if $A^T$ is invertible. (b) The operations of inversion and transpose commute; that is, $(A^T)^{-1} = (A^{-1})^T$. (c) If $A$ has a zero row or zero column, then $A$ is not invertible.


Diagonal and triangular matrices

2.59. Let $A = \text{diag}(1, 2, -3)$ and $B = \text{diag}(2, -5, 0)$. Find
(a) $AB$, $A^2$, $B^2$; (b) $f(A)$, where $f(x) = x^2 + 4x - 3$; (c) $A^{-1}$ and $B^{-1}$.

2.60. Let $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$. (a) Find $A^n$. (b) Find $B^n$.

2.61. Find all real triangular matrices $A$ such that $A^2 = B$, where (a) $B = \begin{bmatrix} 4 & 21 \\ 0 & 25 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & 4 \\ 0 & -9 \end{bmatrix}$.

2.62. Let $A = \begin{bmatrix} 5 & 2 \\ 0 & k \end{bmatrix}$. Find all numbers $k$ for which $A$ is a root of the polynomial:
(a) $f(x) = x^2 - 7x + 10$, (b) $g(x) = x^2 - 25$, (c) $h(x) = x^2 - 4$.

2.63. Let $B = \begin{bmatrix} 1 & 0 \\ 26 & 27 \end{bmatrix}$. Find a matrix $A$ such that $A^3 = B$.

2.64. Let $B = \begin{bmatrix} 1 & 8 & 5 \\ 0 & 9 & 5 \\ 0 & 0 & 4 \end{bmatrix}$. Find a triangular matrix $A$ with positive diagonal entries such that $A^2 = B$.

2.65. Using only the elements 0 and 1, find the number of $3 \times 3$ matrices that are (a) diagonal, (b) upper triangular, (c) nonsingular and upper triangular. Generalize to $n \times n$ matrices.

2.66. Let $D_k = kI$, the scalar matrix belonging to the scalar $k$. Show
(a) $D_kA = kA$, (b) $BD_k = kB$, (c) $D_k + D_{k'} = D_{k+k'}$, (d) $D_kD_{k'} = D_{kk'}$.

2.67. Suppose $AB = C$, where $A$ and $C$ are upper triangular.
(a) Find $2 \times 2$ nonzero matrices $A, B, C$, where $B$ is not upper triangular.
(b) Suppose $A$ is also invertible. Show that $B$ must also be upper triangular.
Special Types of Real Matrices

2.68. Find $x, y, z$ such that $A$ is symmetric, where

$$\text{(a) } A = \begin{bmatrix} 2 & x & 3 \\ 4 & 5 & y \\ z & 1 & 7 \end{bmatrix}, \qquad \text{(b) } A = \begin{bmatrix} 7 & -6 & 2x \\ y & z & -2 \\ x & -2 & 5 \end{bmatrix}$$

2.69. Suppose $A$ is a square matrix. Show (a) $A + A^T$ is symmetric, (b) $A - A^T$ is skew-symmetric, (c) $A = B + C$, where $B$ is symmetric and $C$ is skew-symmetric.

2.70. Write $A = \begin{bmatrix} 4 & 5 \\ 1 & 3 \end{bmatrix}$ as the sum of a symmetric matrix $B$ and a skew-symmetric matrix $C$.

2.71. Suppose $A$ and $B$ are symmetric. Show that the following are also symmetric:
(a) $A + B$; (b) $kA$, for any scalar $k$; (c) $A^2$;
(d) $A^n$, for $n > 0$; (e) $f(A)$, for any polynomial $f(x)$.

2.72. Find a $2 \times 2$ orthogonal matrix $P$ whose first row is a multiple of
(a) $(3, -4)$, (b) $(1, 2)$.

2.73. Find a $3 \times 3$ orthogonal matrix $P$ whose first two rows are multiples of
(a) $(1, 2, 3)$ and $(0, -2, 3)$, (b) $(1, 3, 1)$ and $(1, 0, -1)$.

2.74. Suppose $A$ and $B$ are orthogonal matrices. Show that $A^T$, $A^{-1}$, $AB$ are also orthogonal.


1 1 1 
2.75. Which of the following matrices are normal? 4 = = ,B= a ,C=]0 1 1 
4 3 2 3 
0 0 1 
Complex Matrices 3 x+2i y 


2.76. Find real numbers x,y,z such that A is Hermitian, where A = | 3 — 2i 0 l +zi 


2.77. Suppose $A$ is a complex matrix. Show that $AA^H$ and $A^HA$ are Hermitian.

2.78. Let $A$ be a square matrix. Show that (a) $A + A^H$ is Hermitian, (b) $A - A^H$ is skew-Hermitian, (c) $A = B + C$, where $B$ is Hermitian and $C$ is skew-Hermitian.


2.79. Determine which of the following matrices are unitary: 


j Sab raith e 


2.80. Suppose $A$ and $B$ are unitary. Show that $A^H$, $A^{-1}$, $AB$ are unitary.


2.81. Determine which of the following matrices are normal: A = 


eu 1 
1 0 
sahi ‘|: 


2+ M and 
Block Matrices


1 210 0 0 P oe 

3 40 0 0 E 

2.82. Let U = i and V= |0 01 1 2 
0 015 1 2 

00,3 4 1 i 

| 0 O-4 1 


(a) Find UV using block multiplication. (b) Are U and V block diagonal matrices? 
(c) Is UV block diagonal? 


2.83. Partition each of the following matrices so that it becomes a square block matrix with as many 
diagonal blocks as possible: 


12 0 0 0 
1 0 0 3 0 0 0 0 0 1 0 
A=/;}0 0 2 B={]0 0 4 0 0j, C=]0 0 0 
0 0 3 005 0 0 2 0 0 
000 0 6 
2,9 0 0 I 1/0 0 
; Oil 4,0 2 310 0 
2 3 _ ae eee ae 
2.84. Find M^ and M’? for (a) M = 0'2 110 ,(b) M 0 ori 5 
0,0 0, 3 0 0,4 5 
2.85. For each matrix $M$ in Problem 2.84, find $f(M)$, where $f(x) = x^2 + 4x - 5$.

2.86. Suppose $U = [U_{ik}]$ and $V = [V_{kj}]$ are block matrices for which $UV$ is defined and the number of columns of each block $U_{ik}$ is equal to the number of rows of each block $V_{kj}$. Show that $UV = [W_{ij}]$, where $W_{ij} = \sum_k U_{ik}V_{kj}$.

2.87. Suppose $M$ and $N$ are block diagonal matrices where corresponding blocks have the same size, say $M = \text{diag}(A_i)$ and $N = \text{diag}(B_i)$. Show
(i) $M + N = \text{diag}(A_i + B_i)$, (iii) $MN = \text{diag}(A_iB_i)$,
(ii) $kM = \text{diag}(kA_i)$, (iv) $f(M) = \text{diag}(f(A_i))$ for any polynomial $f(x)$.


ANSWERS TO SUPPLEMENTARY PROBLEMS 


Notation: $A = [R_1; R_2; \ldots]$ denotes a matrix $A$ with rows $R_1, R_2, \ldots$.

2.38. (a) [-5,10; 27,-34], (b) [17,4; -12,13], (c) [-7,-27,11; -8,36,-37]

2.39. (a) [-7,14; 39,-28], [21,105,-98; -17,-285,296]
(b) [5,-15,20; 8,60,-59], [21,105,-98; -17,-285,296]

2.40. (a) [7,-6; -9,22], [-11,38; 57,-106];
(b) [11,-9,17; -7,53,-39], [15,35,-5; 10,-98,69]; (c) not defined

2.41. (a) [1,3; 2,-4], (b) [5,-6; 0,7], (c) [-7,39; 14,-28], (d) [5,15; 10,-40]

2.42. (a) [-13,-3,18; 4,17,0], (b) [-5,-2,4,5; 11,-3,-12,18],
(c) [11,-12,0,-5; -15,5,8,4], (d) [9; 9], (e) [-1; 9], (f) not defined

2.43. (a) [1,0; -1,3; 2,4], (b) [4,0,-3; -7,-6,12; 4,-8,6], (c) not defined

2.44. [2,4,6; -1,-2,-3]

2.45. $[a_1, a_2, a_3, a_4]$, $[b_1, b_2, b_3, b_4]$, $[c_1, c_2, c_3, c_4]$

2.50. (a) 2,-6,-1, tr(A) = -5, (b) 1,1,-1, tr(B) = 1, (c) not defined

2.51. (a) [-11,-15; 9,-14], [-67,40; -24,-59], (b) [-50,70; -42,-36], g(A) = 0

2.52. (a) [14,4; -2,34], [60,-52; 26,-200], (b) f(B) = 0, [-4,10; -5,46]

2.53. $u = [2a, a]^T$

2.54. [3,-4; -5,7], [-5/2,3/2; 2,-1], not defined, [1,-2/3; 2,-5/3]

2.55. [1,1,-1; 2,-5,3; -1,2,-1], [1,1,0; -1,-3,1; -1,-4,1]

2.56. A = [1,2; 1,2], B = [0,0; 1,1], C = [2,2; 0,0]

2.57. A = [1,2; 0,3]; B = [4,3; 3,0]

2.58. (c) Hint: Use Problem 2.48

2.59. (a) AB = diag(2,-10,0), $A^2$ = diag(1,4,9), $B^2$ = diag(4,25,0);
(b) f(A) = diag(2,9,-6); (c) $A^{-1}$ = diag(1, 1/2, -1/3), $B^{-1}$ does not exist

2.60. (a) [1,2n; 0,1], (b) [1,n,n(n-1)/2; 0,1,n; 0,0,1]

2.61. (a) [2,3; 0,5], [-2,-3; 0,-5], [2,-7; 0,-5], [-2,7; 0,5], (b) none

2.62. (a) k = 2, (b) k = -5, (c) none

2.63. [1,0; 2,3]

2.64. [1,2,1; 0,3,1; 0,0,2]

2.65. All entries below the diagonal must be 0 to be upper triangular, and all diagonal entries must be 1 to be nonsingular.
(a) 8 $(2^n)$, (b) $2^6$ $(2^{n(n+1)/2})$, (c) $2^3$ $(2^{n(n-1)/2})$

2.67. (a) A = [1,1; 0,0], B = [1,2; 3,4], C = [4,6; 0,0]

2.68. (a) x = 4, y = 1, z = 3; (b) x = 0, y = -6, z any real number

2.69. (c) Hint: Let $B = \tfrac12(A + A^T)$ and $C = \tfrac12(A - A^T)$.

2.70. B = [4,3; 3,3], C = [0,2; -2,0]

2.72. (a) [3/5,-4/5; 4/5,3/5], (b) $[1/\sqrt5, 2/\sqrt5;\ 2/\sqrt5, -1/\sqrt5]$

2.73. (a) $[1/\sqrt{14}, 2/\sqrt{14}, 3/\sqrt{14};\ 0, -2/\sqrt{13}, 3/\sqrt{13};\ 12/\sqrt{157}, -3/\sqrt{157}, -2/\sqrt{157}]$
(b) $[1/\sqrt{11}, 3/\sqrt{11}, 1/\sqrt{11};\ 1/\sqrt2, 0, -1/\sqrt2;\ 3/\sqrt{22}, -2/\sqrt{22}, 3/\sqrt{22}]$

2.75. A, C


2.76. x = 3, y = 0, z = 3

2.78. (c) Hint: Let $B = \tfrac12(A + A^H)$ and $C = \tfrac12(A - A^H)$.

2.79. A, B, C

2.81. A

2.82. (a) UV = diag([7,6; 17,10], [-1,9; 7,-5]); (b) no; (c) yes

2.83. A: line between first and second rows (columns);
B: line between second and third rows (columns) and between fourth and fifth rows (columns);
C: C itself; no further partitioning of C is possible.

2.84. (a) $M^2$ = diag([4], [9,8; 4,9], [9]),
$M^3$ = diag([8], [25,44; 22,25], [27])
(b) $M^2$ = diag([3,4; 8,11], [9,12; 24,33]),
$M^3$ = diag([11,15; 30,41], [57,78; 156,213])

2.85. (a) diag([7], [8,24; 12,8], [16]), (b) diag([2,8; 16,18], [8,20; 40,48])


Systems of Linear Equations


3.1 Introduction 


Systems of linear equations play an important and motivating role in the subject of linear algebra. In fact, 
many problems in linear algebra reduce to finding the solution of a system of linear equations. Thus, the 
techniques introduced in this chapter will be applicable to abstract ideas introduced later. On the other 
hand, some of the abstract results will give us new insights into the structure and properties of systems of 
linear equations. 

All our systems of linear equations involve scalars as both coefficients and constants, and such scalars 
may come from any number field K. There is almost no loss in generality if the reader assumes that all 
our scalars are real numbers—that is, that they come from the real field R. 


3.2 Basic Definitions, Solutions 


This section gives basic definitions connected with the solutions of systems of linear equations. The 
actual algorithms for finding such solutions will be treated later. 


Linear Equation and Solutions 


A linear equation in unknowns $x_1, x_2, \ldots, x_n$ is an equation that can be put in the standard form

$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = b \qquad (3.1)$$

where $a_1, a_2, \ldots, a_n$, and $b$ are constants. The constant $a_k$ is called the coefficient of $x_k$, and $b$ is called the constant term of the equation.

A solution of the linear equation (3.1) is a list of values for the unknowns or, equivalently, a vector $u$ in $K^n$, say

$$x_1 = k_1, \quad x_2 = k_2, \quad \ldots, \quad x_n = k_n \qquad \text{or} \qquad u = (k_1, k_2, \ldots, k_n)$$

such that the following statement (obtained by substituting $k_i$ for $x_i$ in the equation) is true:

$$a_1k_1 + a_2k_2 + \cdots + a_nk_n = b$$

In such a case we say that $u$ satisfies the equation.

Remark: Equation (3.1) implicitly assumes there is an ordering of the unknowns. In order to avoid subscripts, we will usually use $x, y$ for two unknowns; $x, y, z$ for three unknowns; and $x, y, z, t$ for four unknowns; they will be ordered as shown.
EXAMPLE 3.1 Consider the following linear equation in three unknowns $x, y, z$:

$$x + 2y - 3z = 6$$

We note that $x = 5$, $y = 2$, $z = 1$, or, equivalently, the vector $u = (5, 2, 1)$ is a solution of the equation. That is,

$$5 + 2(2) - 3(1) = 6 \quad \text{or} \quad 5 + 4 - 3 = 6 \quad \text{or} \quad 6 = 6$$

On the other hand, $w = (1, 2, 3)$ is not a solution, because on substitution, we do not get a true statement:

$$1 + 2(2) - 3(3) = 6 \quad \text{or} \quad 1 + 4 - 9 = 6 \quad \text{or} \quad -4 = 6$$


System of Linear Equations 


A system of linear equations is a list of linear equations with the same unknowns. In particular, a system of $m$ linear equations $L_1, L_2, \ldots, L_m$ in $n$ unknowns $x_1, x_2, \ldots, x_n$ can be put in the standard form

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\ \,\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned} \qquad (3.2)$$

where the $a_{ij}$ and $b_i$ are constants. The number $a_{ij}$ is the coefficient of the unknown $x_j$ in the equation $L_i$, and the number $b_i$ is the constant of the equation $L_i$.

The system (3.2) is called an $m \times n$ (read: $m$ by $n$) system. It is called a square system if $m = n$; that is, if the number $m$ of equations is equal to the number $n$ of unknowns.

The system (3.2) is said to be homogeneous if all the constant terms are zero; that is, if $b_1 = 0, b_2 = 0, \ldots, b_m = 0$. Otherwise the system is said to be nonhomogeneous.


A solution (or a particular solution) of the system (3.2) is a list of values for the unknowns or, 
equivalently, a vector u in K”, which is a solution of each of the equations in the system. The set of all 
solutions of the system is called the solution set or the general solution of the system. 


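EXAMPLE 3.2 Consider the following system of linear equations:

$$\begin{aligned} x_1 + x_2 + 4x_3 + 3x_4 &= 5 \\ 2x_1 + 3x_2 + x_3 - 2x_4 &= 1 \\ x_1 + 2x_2 - 5x_3 + 4x_4 &= 3 \end{aligned}$$

It is a $3 \times 4$ system because it has three equations in four unknowns. Determine whether (a) $u = (-8, 6, 1, 1)$ and (b) $v = (-10, 5, 1, 2)$ are solutions of the system.

(a) Substitute the values of $u$ in each equation, obtaining

$$\begin{aligned} -8 + 6 + 4(1) + 3(1) = 5 \quad &\text{or} \quad -8 + 6 + 4 + 3 = 5 \quad &\text{or} \quad 5 = 5 \\ 2(-8) + 3(6) + 1 - 2(1) = 1 \quad &\text{or} \quad -16 + 18 + 1 - 2 = 1 \quad &\text{or} \quad 1 = 1 \\ -8 + 2(6) - 5(1) + 4(1) = 3 \quad &\text{or} \quad -8 + 12 - 5 + 4 = 3 \quad &\text{or} \quad 3 = 3 \end{aligned}$$

Yes, $u$ is a solution of the system because it is a solution of each equation.

(b) Substitute the values of $v$ into each successive equation, obtaining

$$\begin{aligned} -10 + 5 + 4(1) + 3(2) = 5 \quad &\text{or} \quad -10 + 5 + 4 + 6 = 5 \quad &\text{or} \quad 5 = 5 \\ 2(-10) + 3(5) + 1 - 2(2) = 1 \quad &\text{or} \quad -20 + 15 + 1 - 4 = 1 \quad &\text{or} \quad -8 = 1 \end{aligned}$$

No, $v$ is not a solution of the system, because it is not a solution of the second equation. (We do not need to substitute $v$ into the third equation.)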
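
The substitution test of Examples 3.1 and 3.2 is easy to mechanize. The following sketch assumes each equation is stored as a pair (list of coefficients, constant); that representation and the function names are choices made for this illustration.

```python
def satisfies(equation, u):
    """True when the vector u satisfies one linear equation (coeffs, const)."""
    coeffs, const = equation
    return sum(a * k for a, k in zip(coeffs, u)) == const

def is_solution(system, u):
    """True when u satisfies every equation; all() stops at the first failure."""
    return all(satisfies(eq, u) for eq in system)

# The 3 x 4 system of Example 3.2.
system = [([1, 1, 4, 3], 5),
          ([2, 3, 1, -2], 1),
          ([1, 2, -5, 4], 3)]

print(is_solution(system, (-8, 6, 1, 1)))    # u
print(is_solution(system, (-10, 5, 1, 2)))   # v fails the second equation
```

As in part (b) of the example, the check for $v$ never reaches the third equation, because `all` stops as soon as one equation fails.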
The system (3.2) of linear equations is said to be consistent if it has one or more solutions, and it is said to be inconsistent if it has no solution. If the field $K$ of scalars is infinite, such as when $K$ is the real field $\mathbf{R}$ or the complex field $\mathbf{C}$, then we have the following important result.

THEOREM 3.1: Suppose the field $K$ is infinite. Then any system $\mathscr{L}$ of linear equations has (i) a unique solution, (ii) no solution, or (iii) an infinite number of solutions.

This situation is pictured in Fig. 3-1. The three cases have a geometrical description when the system $\mathscr{L}$ consists of two equations in two unknowns (Section 3.4).

[Figure 3-1: System of linear equations, with three cases: no solution, unique solution, infinite number of solutions.]

Augmented and Coefficient Matrices of a System 


Consider again the general system (3.2) of m equations in n unknowns. Such a system has associated with 
it the following two matrices: 


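$$M = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
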

The first matrix M is called the augmented matrix of the system, and the second matrix A is called the 
coefficient matrix. 

The coefficient matrix A is simply the matrix of coefficients, which is the augmented matrix M without 
the last column of constants. Some texts write M = [A,B] to emphasize the two parts of M, where B 
denotes the column vector of constants. The augmented matrix M and the coefficient matrix A of the 
system in Example 3.2 are as follows: 


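$$M = \begin{bmatrix} 1 & 1 & 4 & 3 & 5 \\ 2 & 3 & 1 & -2 & 1 \\ 1 & 2 & -5 & 4 & 3 \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 1 & 1 & 4 & 3 \\ 2 & 3 & 1 & -2 \\ 1 & 2 & -5 & 4 \end{bmatrix}$$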


As expected, A consists of all the columns of M except the last, which is the column of constants. 

Clearly, a system of linear equations is completely determined by its augmented matrix M, and vice 
versa. Specifically, each row of M corresponds to an equation of the system, and each column of M 
corresponds to the coefficients of an unknown, except for the last column, which corresponds to the 
constants of the system. 
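
The correspondence between a system and its augmented matrix can be made concrete with a short sketch. It assumes matrices are stored as lists of rows; `augmented` and `split_augmented` are hypothetical helper names.

```python
def augmented(a, b):
    """Augmented matrix M = [A, B]: append each constant to its row of coefficients."""
    return [row + [const] for row, const in zip(a, b)]

def split_augmented(m):
    """Recover the coefficient matrix A and the column of constants from M."""
    return [row[:-1] for row in m], [row[-1] for row in m]

# Coefficients and constants of the system in Example 3.2.
A = [[1, 1, 4, 3], [2, 3, 1, -2], [1, 2, -5, 4]]
b = [5, 1, 3]

M = augmented(A, b)
print(M)
print(split_augmented(M) == (A, b))
```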


Degenerate Linear Equations 


A linear equation is said to be degenerate if all the coefficients are zero; that is, if it has the form

$$0x_1 + 0x_2 + \cdots + 0x_n = b \qquad (3.3)$$

The solution of such an equation depends only on the value of the constant $b$. Specifically,
(i) If $b \ne 0$, then the equation has no solution.
(ii) If $b = 0$, then every vector $u = (k_1, k_2, \ldots, k_n)$ in $K^n$ is a solution.

The following theorem applies.

THEOREM 3.2: Let $\mathscr{L}$ be a system of linear equations that contains a degenerate equation $L$, say with constant $b$.

(i) If $b \ne 0$, then the system $\mathscr{L}$ has no solution.

(ii) If $b = 0$, then $L$ may be deleted from the system without changing the solution set of the system.

Part (i) comes from the fact that the degenerate equation has no solution, so the system has no solution. Part (ii) comes from the fact that every element in $K^n$ is a solution of the degenerate equation.
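
Theorem 3.2 suggests a simple preprocessing pass over a system. The sketch below assumes each equation is a pair (coefficients, constant); the convention of returning `None` for a system that is inconsistent by part (i) is a choice made here for illustration.

```python
def prune_degenerate(system):
    """Apply Theorem 3.2: drop degenerate equations with b = 0;
    return None if some degenerate equation has b != 0 (no solution)."""
    kept = []
    for coeffs, const in system:
        if all(a == 0 for a in coeffs):     # degenerate equation
            if const != 0:
                return None                 # part (i): the system has no solution
            continue                        # part (ii): safe to delete
        kept.append((coeffs, const))
    return kept

# Hypothetical systems chosen only for illustration.
system_a = [([1, 2], 3), ([0, 0], 0), ([4, 5], 6)]
system_b = [([1, 2], 3), ([0, 0], 7)]

print(prune_degenerate(system_a))
print(prune_degenerate(system_b))
```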


Leading Unknown in a Nondegenerate Linear Equation 


Now let $L$ be a nondegenerate linear equation. This means one or more of the coefficients of $L$ are not zero. By the leading unknown of $L$, we mean the first unknown in $L$ with a nonzero coefficient. For example, $x_3$ and $y$ are the leading unknowns, respectively, in the equations

$$0x_1 + 0x_2 + 5x_3 + 6x_4 + 0x_5 + 8x_6 = 7 \quad \text{and} \quad 0x + 2y - 4z = 5$$

We frequently omit terms with zero coefficients, so the above equations would be written as

$$5x_3 + 6x_4 + 8x_6 = 7 \quad \text{and} \quad 2y - 4z = 5$$

In such a case, the leading unknown appears first.
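
In code, finding the leading unknown is a scan for the first nonzero coefficient. A minimal sketch, assuming the coefficients are given as a list and positions are counted from 0 (so index 2 corresponds to $x_3$):

```python
def leading_unknown(coeffs):
    """Index (from 0) of the first nonzero coefficient, or None if degenerate."""
    for index, a in enumerate(coeffs):
        if a != 0:
            return index
    return None

print(leading_unknown([0, 0, 5, 6, 0, 8]))   # index of x3
print(leading_unknown([0, 2, -4]))           # index of y
print(leading_unknown([0, 0, 0]))            # degenerate: no leading unknown
```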


3.3 Equivalent Systems, Elementary Operations 


Consider the system (3.2) of $m$ linear equations in $n$ unknowns. Let $L$ be the linear equation obtained by multiplying the $m$ equations by constants $c_1, c_2, \ldots, c_m$, respectively, and then adding the resulting equations. Specifically, let $L$ be the following linear equation:

$$(c_1a_{11} + \cdots + c_ma_{m1})x_1 + \cdots + (c_1a_{1n} + \cdots + c_ma_{mn})x_n = c_1b_1 + \cdots + c_mb_m$$

Then $L$ is called a linear combination of the equations in the system. One can easily show (Problem 3.43) that any solution of the system (3.2) is also a solution of the linear combination $L$.


EXAMPLE 3.3 Let $L_1, L_2, L_3$ denote, respectively, the three equations in Example 3.2. Let $L$ be the equation obtained by multiplying $L_1, L_2, L_3$ by 3, $-2$, 4, respectively, and then adding. Namely,

$$\begin{array}{rrcr} 3L_1: & 3x_1 + 3x_2 + 12x_3 + 9x_4 & = & 15 \\ -2L_2: & -4x_1 - 6x_2 - 2x_3 + 4x_4 & = & -2 \\ 4L_3: & 4x_1 + 8x_2 - 20x_3 + 16x_4 & = & 12 \\ \hline (\text{Sum})\ L: & 3x_1 + 5x_2 - 10x_3 + 29x_4 & = & 25 \end{array}$$

Then $L$ is a linear combination of $L_1, L_2, L_3$. As expected, the solution $u = (-8, 6, 1, 1)$ of the system is also a solution of $L$. That is, substituting $u$ in $L$, we obtain a true statement:

$$3(-8) + 5(6) - 10(1) + 29(1) = 25 \quad \text{or} \quad -24 + 30 - 10 + 29 = 25 \quad \text{or} \quad 25 = 25$$
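
The computation in Example 3.3 can be reproduced with a short sketch. Equations are again stored as (coefficients, constant) pairs; `linear_combination` is a hypothetical helper written for this illustration.

```python
def linear_combination(system, multipliers):
    """Form c1*L1 + ... + cm*Lm for equations stored as (coeffs, const) pairs."""
    n = len(system[0][0])
    coeffs = [sum(c * eq[0][j] for c, eq in zip(multipliers, system))
              for j in range(n)]
    const = sum(c * eq[1] for c, eq in zip(multipliers, system))
    return coeffs, const

# The system of Example 3.2 and the multipliers 3, -2, 4 of Example 3.3.
system = [([1, 1, 4, 3], 5),
          ([2, 3, 1, -2], 1),
          ([1, 2, -5, 4], 3)]
L = linear_combination(system, [3, -2, 4])

u = (-8, 6, 1, 1)
print(L)
print(sum(a * k for a, k in zip(L[0], u)) == L[1])   # u also satisfies L
```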


The following theorem holds. 


THEOREM 3.3: Two systems of linear equations have the same solutions if and only if each equation in 
each system is a linear combination of the equations in the other system. 


Two systems of linear equations are said to be equivalent if they have the same solutions. The next 
subsection shows one way to obtain equivalent systems of linear equations. 


Elementary Operations 


The following operations on a system of linear equations $L_1, L_2, \ldots, L_m$ are called elementary operations.

[E$_1$] Interchange two of the equations. We indicate that the equations $L_i$ and $L_j$ are interchanged by writing:
"Interchange $L_i$ and $L_j$" or "$L_i \leftrightarrow L_j$"

[E$_2$] Replace an equation by a nonzero multiple of itself. We indicate that equation $L_i$ is replaced by $kL_i$ (where $k \ne 0$) by writing
"Replace $L_i$ by $kL_i$" or "$kL_i \to L_i$"

[E$_3$] Replace an equation by the sum of a multiple of another equation and itself. We indicate that equation $L_i$ is replaced by the sum of $kL_j$ and $L_i$ by writing
"Replace $L_i$ by $kL_j + L_i$" or "$kL_j + L_i \to L_i$"

The arrow $\to$ in [E$_2$] and [E$_3$] may be read as "replaces."


The main property of the above elementary operations is contained in the following theorem (proved 
in Problem 3.45). 


THEOREM 3.4: Suppose a system $\mathscr{M}$ of linear equations is obtained from a system $\mathscr{L}$ of linear equations by a finite sequence of elementary operations. Then $\mathscr{M}$ and $\mathscr{L}$ have the same solutions.
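
The three elementary operations act naturally on the rows of the augmented matrix. The sketch below is one possible rendering, with rows indexed from 0 and each function returning a new list of rows; the function names are hypothetical. It also spot-checks Theorem 3.4 on the system of Example 3.2: a solution stays a solution, and a non-solution stays a non-solution.

```python
def interchange(rows, i, j):
    """[E1] Interchange equations L_i and L_j."""
    new = [r[:] for r in rows]
    new[i], new[j] = new[j], new[i]
    return new

def scale(rows, i, k):
    """[E2] Replace L_i by k*L_i, where k must be nonzero."""
    if k == 0:
        raise ValueError("k must be nonzero")
    new = [r[:] for r in rows]
    new[i] = [k * x for x in new[i]]
    return new

def add_multiple(rows, i, j, k):
    """[E3] Replace L_i by k*L_j + L_i, where j differs from i."""
    if i == j:
        raise ValueError("i and j must be different equations")
    new = [r[:] for r in rows]
    new[i] = [k * y + x for x, y in zip(new[i], new[j])]
    return new

def is_solution(rows, u):
    """True when u satisfies every row [a_1, ..., a_n, b]."""
    return all(sum(a * x for a, x in zip(r[:-1], u)) == r[-1] for r in rows)

# Augmented rows of the system in Example 3.2.
rows = [[1, 1, 4, 3, 5], [2, 3, 1, -2, 1], [1, 2, -5, 4, 3]]
step = add_multiple(rows, 1, 0, -2)     # -2*L1 + L2 -> L2
step = scale(step, 2, 3)                # 3*L3 -> L3
step = interchange(step, 0, 2)          # L1 <-> L3

u, v = (-8, 6, 1, 1), (-10, 5, 1, 2)
print(is_solution(rows, u), is_solution(step, u))
print(is_solution(rows, v), is_solution(step, v))
```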


Remark: Sometimes (say to avoid fractions when all the given scalars are integers) we may apply [E$_2$] and [E$_3$] in one step; that is, we may apply the following operation:

[E] Replace equation $L_i$ by the sum of $kL_j$ and $k'L_i$ (where $k' \ne 0$), written
"Replace $L_i$ by $kL_j + k'L_i$" or "$kL_j + k'L_i \to L_i$"

We emphasize that in operations [E$_3$] and [E], only equation $L_i$ is changed.


Gaussian elimination, our main method for finding the solution of a given system of linear 
equations, consists of using the above operations to transform a given system into an equivalent 
system whose solution can be easily obtained. 


The details of Gaussian elimination are discussed in subsequent sections. 


3.4 Small Square Systems of Linear Equations 


This section considers the special case of one equation in one unknown, and two equations in two 
unknowns. These simple systems are treated separately because their solution sets can be described 
geometrically, and their properties motivate the general case. 
Linear Equation in One Unknown


The following simple basic result is proved in Problem 3.5. 


THEOREM 3.5: Consider the linear equation $ax = b$.
(i) If $a \ne 0$, then $x = b/a$ is a unique solution of $ax = b$.
(ii) If $a = 0$, but $b \ne 0$, then $ax = b$ has no solution.
(iii) If $a = 0$ and $b = 0$, then every scalar $k$ is a solution of $ax = b$.

EXAMPLE 3.4 Solve (a) $4x - 1 = x + 6$, (b) $2x - 5 - x = x + 3$, (c) $4 + x - 3 = 2x + 1 - x$.

(a) Rewrite the equation in standard form, obtaining $3x = 7$. Then $x = \tfrac73$ is the unique solution [Theorem 3.5(i)].
(b) Rewrite the equation in standard form, obtaining $0x = 8$. The equation has no solution [Theorem 3.5(ii)].
(c) Rewrite the equation in standard form, obtaining $0x = 0$. Then every scalar $k$ is a solution [Theorem 3.5(iii)].
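
The three cases of Theorem 3.5 map onto a three-way branch. A minimal sketch, assuming integer inputs already in the standard form $ax = b$ and an ad hoc return convention (a label plus the solution, if any):

```python
from fractions import Fraction

def solve_one_unknown(a, b):
    """Solve ax = b according to the three cases of Theorem 3.5."""
    if a != 0:
        return ("unique", Fraction(b, a))   # case (i): x = b/a
    if b != 0:
        return ("none", None)               # case (ii): 0x = b with b nonzero
    return ("all", None)                    # case (iii): every scalar works

# Standard forms obtained in Example 3.4.
print(solve_one_unknown(3, 7))   # (a) 3x = 7
print(solve_one_unknown(0, 8))   # (b) 0x = 8
print(solve_one_unknown(0, 0))   # (c) 0x = 0
```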


System of Two Linear Equations in Two Unknowns (2x2 System) 


Consider a system of two nondegenerate linear equations in two unknowns x and y, which can be put in 
the standard form 


$$\begin{aligned} A_1x + B_1y &= C_1 \\ A_2x + B_2y &= C_2 \end{aligned} \qquad (3.4)$$

Because the equations are nondegenerate, $A_1$ and $B_1$ are not both zero, and $A_2$ and $B_2$ are not both zero.

The general solution of the system (3.4) belongs to one of three types as indicated in Fig. 3-1. If $\mathbf{R}$ is the field of scalars, then the graph of each equation is a line in the plane $\mathbf{R}^2$ and the three types may be described geometrically as pictured in Fig. 3-2. Specifically,


(1) The system has exactly one solution.
Here the two lines intersect in one point [Fig. 3-2(a)]. This occurs when the lines have distinct slopes or, equivalently, when the coefficients of $x$ and $y$ are not proportional:

$$\frac{A_1}{A_2} \ne \frac{B_1}{B_2} \quad \text{or, equivalently,} \quad A_1B_2 - A_2B_1 \ne 0$$

For example, in Fig. 3-2(a), $1/3 \ne -1/2$.

(2) The system has no solution.
Here the two lines are parallel [Fig. 3-2(b)]. This occurs when the lines have the same slopes but different $y$ intercepts, or when

$$\frac{A_1}{A_2} = \frac{B_1}{B_2} \ne \frac{C_1}{C_2}$$

For example, in Fig. 3-2(b), $1/2 = 3/6 \ne -3/8$.

(3) The system has an infinite number of solutions.
Here the two lines coincide [Fig. 3-2(c)]. This occurs when the lines have the same slopes and same $y$ intercepts, or when the coefficients and constants are proportional,

$$\frac{A_1}{A_2} = \frac{B_1}{B_2} = \frac{C_1}{C_2}$$

For example, in Fig. 3-2(c), $1/2 = 2/4 = 4/8$.

[Figure 3-2: graphs of the lines $L_1$ and $L_2$ for (a) $L_1$: $x - y = 4$, $L_2$: $3x + 2y = 12$; (b) $L_1$: $x + 3y = 3$, $L_2$: $2x + 6y = -8$; (c) $L_1$: $x + 2y = 4$, $L_2$: $2x + 4y = 8$.]


Remark: The following expression and its value is called a determinant of order two:

$$\begin{vmatrix} A_1 & B_1 \\ A_2 & B_2 \end{vmatrix} = A_1B_2 - A_2B_1$$

Determinants will be studied in Chapter 8. Thus, the system (3.4) has a unique solution if and only if the determinant of its coefficients is not zero. (We show later that this statement is true for any square system of linear equations.)
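
The three cases for a $2 \times 2$ system can be decided without dividing, by replacing each ratio test with a cross-multiplication. The sketch below assumes each equation is a triple $(A, B, C)$ and that both equations are nondegenerate, as in the text; the function name `classify_2x2` is hypothetical.

```python
def classify_2x2(eq1, eq2):
    """Classify A1 x + B1 y = C1, A2 x + B2 y = C2 as 'unique', 'none', or 'infinite'."""
    a1, b1, c1 = eq1
    a2, b2, c2 = eq2
    if (a1 == 0 and b1 == 0) or (a2 == 0 and b2 == 0):
        raise ValueError("both equations must be nondegenerate")
    det = a1 * b2 - a2 * b1                  # determinant of the coefficients
    if det != 0:
        return "unique"                      # lines intersect in one point
    # Coefficients are proportional; test whether the constants are too.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinite"                    # lines coincide
    return "none"                            # lines are parallel

# The three systems listed for Fig. 3-2.
print(classify_2x2((1, -1, 4), (3, 2, 12)))
print(classify_2x2((1, 3, 3), (2, 6, -8)))
print(classify_2x2((1, 2, 4), (2, 4, 8)))
```

Cross-multiplying avoids division by zero when one of $A_2$, $B_2$, $C_2$ vanishes, a case the ratio notation glosses over.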


Elimination Algorithm 


The solution to system (3.4) can be obtained by the process of elimination, whereby we reduce the system 
to a single equation in only one unknown. Assuming the system has a unique solution, this elimination 
algorithm has two parts. 


ALGORITHM 3.1: The input consists of two nondegenerate linear equations $L_1$ and $L_2$ in two unknowns with a unique solution.

Part A. (Forward Elimination) Multiply each equation by a constant so that the resulting coefficients of one unknown are negatives of each other, and then add the two equations to obtain a new equation $L$ that has only one unknown.


Part B. (Back-Substitution) Solve for the unknown in the new equation L (which contains only one 
unknown), substitute this value of the unknown into one of the original equations, and then 
solve to obtain the value of the other unknown. 


Part A of Algorithm 3.1 can be applied to any system even if the system does not have a unique 
solution. In such a case, the new equation L will be degenerate and Part B will not apply. 


EXAMPLE 3.5 (Unique Case). Solve the system

$$\begin{aligned} L_1:&\quad 2x - 3y = -8 \\ L_2:&\quad 3x + 4y = 5 \end{aligned}$$

The unknown $x$ is eliminated from the equations by forming the new equation $L = -3L_1 + 2L_2$. That is, we multiply $L_1$ by $-3$ and $L_2$ by 2 and add the resulting equations as follows:

$$\begin{array}{rrcr} -3L_1: & -6x + 9y & = & 24 \\ 2L_2: & 6x + 8y & = & 10 \\ \hline \text{Addition}: & 17y & = & 34 \end{array}$$

We now solve the new equation for $y$, obtaining $y = 2$. We substitute $y = 2$ into one of the original equations, say $L_1$, and solve for the other unknown $x$, obtaining

$$2x - 3(2) = -8 \quad \text{or} \quad 2x - 6 = -8 \quad \text{or} \quad 2x = -2 \quad \text{or} \quad x = -1$$

Thus, $x = -1$, $y = 2$, or the pair $u = (-1, 2)$ is the unique solution of the system. The unique solution is expected, because $2/3 \ne -3/4$. [Geometrically, the lines corresponding to the equations intersect at the point $(-1, 2)$.]
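
The two parts of Algorithm 3.1 can be sketched as follows. This is one possible rendering, not the only one: it always eliminates $x$ by forming $L = -A_2L_1 + A_1L_2$ (which for Example 3.5 gives the same multipliers, $-3$ and 2, as in the text), uses exact fractions, and raises an error when the new equation is degenerate.

```python
from fractions import Fraction

def solve_2x2(eq1, eq2):
    """Solve A1 x + B1 y = C1, A2 x + B2 y = C2 when the solution is unique."""
    a1, b1, c1 = (Fraction(v) for v in eq1)
    a2, b2, c2 = (Fraction(v) for v in eq2)

    # Part A (forward elimination): L = -a2*L1 + a1*L2 has no x term.
    y_coeff = -a2 * b1 + a1 * b2
    const = -a2 * c1 + a1 * c2
    if y_coeff == 0:
        raise ValueError("no unique solution: the new equation is degenerate")

    # Part B (back-substitution): solve L for y, then an original equation for x.
    y = const / y_coeff
    if a1 != 0:
        x = (c1 - b1 * y) / a1
    else:
        x = (c2 - b2 * y) / a2   # a2 is nonzero here because y_coeff is nonzero
    return x, y

print(solve_2x2((2, -3, -8), (3, 4, 5)))   # the system of Example 3.5
```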


EXAMPLE 3.6 (Nonunique Cases) 
(a) Solve the system 

$$\begin{aligned} L_1:&\quad x - 3y = 4 \\ L_2:&\quad -2x + 6y = 5 \end{aligned}$$

We eliminate x from the equations by multiplying $L_1$ by 2 and adding it to $L_2$; that is, by forming the new 
equation $L = 2L_1 + L_2$. This yields the degenerate equation 

$$0x + 0y = 13$$

which has a nonzero constant $b = 13$. Thus, this equation and the system have no solution. This is expected, 
because $1/(-2) = -3/6 \ne 4/5$. (Geometrically, the lines corresponding to the equations are parallel.) 
(b) Solve the system 

$$\begin{aligned} L_1:&\quad x - 3y = 4 \\ L_2:&\quad -2x + 6y = -8 \end{aligned}$$

We eliminate x from the equations by multiplying $L_1$ by 2 and adding it to $L_2$; that is, by forming the new 
equation $L = 2L_1 + L_2$. This yields the degenerate equation 

$$0x + 0y = 0$$

where the constant term is also zero. Thus, the system has an infinite number of solutions, which correspond to 
the solutions of either equation. This is expected, because $1/(-2) = -3/6 = 4/(-8)$. (Geometrically, the lines 
corresponding to the equations coincide.) 

To find the general solution, let $y = a$, and substitute into $L_1$ to obtain 

$$x - 3a = 4 \quad\text{or}\quad x = 3a + 4$$

Thus, the general solution of the system is 

$$x = 3a + 4,\ y = a \quad\text{or}\quad u = (3a + 4,\ a)$$


where a (called a parameter) is any scalar. 
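The two parts of Algorithm 3.1, together with the degenerate outcomes seen in Example 3.6, can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `solve_2x2` and the tuple encoding `(a, b, c)` for the equation ax + by = c are assumptions made here, not notation from the text. Exact rational arithmetic is used so that the results match the hand computations.

```python
from fractions import Fraction

def solve_2x2(eq1, eq2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by elimination.

    Each equation is a tuple (a, b, c) and is assumed nondegenerate.
    Returns ("unique", (x, y)), ("none", None), or ("infinite", None).
    """
    a1, b1, c1 = map(Fraction, eq1)
    a2, b2, c2 = map(Fraction, eq2)
    # Part A: choose multipliers m1, m2 so that L = m1*L1 + m2*L2
    # has only one unknown left.
    if a1 == 0 and a2 == 0:
        m1, m2 = -b2, b1          # x is absent already, so eliminate y
    else:
        m1, m2 = -a2, a1          # eliminate x
    coeff_y = m1 * b1 + m2 * b2
    const = m1 * c1 + m2 * c2
    if coeff_y == 0:
        # L is degenerate: 0x + 0y = const
        return ("none", None) if const != 0 else ("infinite", None)
    # Part B: solve L for y, then back-substitute into an original
    # equation in which x actually appears.
    y = const / coeff_y
    x = (c1 - b1 * y) / a1 if a1 != 0 else (c2 - b2 * y) / a2
    return ("unique", (x, y))

example_3_5 = solve_2x2((2, -3, -8), (3, 4, 5))
example_3_6a = solve_2x2((1, -3, 4), (-2, 6, 5))
example_3_6b = solve_2x2((1, -3, 4), (-2, 6, -8))
```

For Example 3.5 the multipliers chosen by the sketch are exactly −3 and 2, so the intermediate equation is 17y = 34 as in the text.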


3.5 Systems in Triangular and Echelon Forms 


The main method for solving systems of linear equations, Gaussian elimination, is treated in Section 3.6. 
Here we consider two simple types of systems of linear equations: systems in triangular form and the 
more general systems in echelon form. 


Triangular Form 
Consider the following system of linear equations, which is in triangular form: 

$$\begin{aligned} 2x_1 - 3x_2 + 5x_3 - 2x_4 &= 9 \\ 5x_2 - x_3 + 3x_4 &= 1 \\ 7x_3 - x_4 &= 3 \\ 2x_4 &= 8 \end{aligned}$$

That is, the first unknown $x_1$ is the leading unknown in the first equation, the second unknown $x_2$ is the 
leading unknown in the second equation, and so on. Thus, in particular, the system is square and each 
leading unknown is directly to the right of the leading unknown in the preceding equation. 

Such a triangular system always has a unique solution, which may be obtained by back-substitution. 
That is, 


(1) First solve the last equation for the last unknown to get $x_4 = 4$. 

(2) Then substitute this value $x_4 = 4$ in the next-to-last equation, and solve for the next-to-last unknown 
$x_3$ as follows: 

$$7x_3 - 4 = 3 \quad\text{or}\quad 7x_3 = 7 \quad\text{or}\quad x_3 = 1$$

(3) Now substitute $x_3 = 1$ and $x_4 = 4$ in the second equation, and solve for the second unknown $x_2$ as 
follows: 

$$5x_2 - 1 + 12 = 1 \quad\text{or}\quad 5x_2 + 11 = 1 \quad\text{or}\quad 5x_2 = -10 \quad\text{or}\quad x_2 = -2$$

(4) Finally, substitute $x_2 = -2$, $x_3 = 1$, $x_4 = 4$ in the first equation, and solve for the first unknown $x_1$ as 
follows: 

$$2x_1 + 6 + 5 - 8 = 9 \quad\text{or}\quad 2x_1 + 3 = 9 \quad\text{or}\quad 2x_1 = 6 \quad\text{or}\quad x_1 = 3$$

Thus, $x_1 = 3$, $x_2 = -2$, $x_3 = 1$, $x_4 = 4$, or, equivalently, the vector $u = (3, -2, 1, 4)$ is the unique 
solution of the system. 
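Back-substitution on a triangular system is mechanical enough to sketch in code. The following is a minimal sketch, assuming the system is stored as an upper triangular coefficient matrix `A` (a list of rows) with nonzero diagonal and a list `b` of constants; the name `back_substitute` is chosen here for illustration only.

```python
from fractions import Fraction

def back_substitute(A, b):
    """Solve an upper triangular system A x = b with nonzero diagonal."""
    n = len(A)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):            # last equation first
        # Contribution of the unknowns that are already solved.
        known = sum(Fraction(A[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(b[i]) - known) / Fraction(A[i][i])
    return x

# The triangular system used in the text.
A = [[2, -3,  5, -2],
     [0,  5, -1,  3],
     [0,  0,  7, -1],
     [0,  0,  0,  2]]
b = [9, 1, 3, 8]
solution = back_substitute(A, b)
```

The loop visits the equations in the same order as steps (1) to (4): it finds the last unknown first and ends with the first unknown.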


Remark: There is an alternative form for back-substitution (which will be used when solving a 
system using the matrix format). Namely, after first finding the value of the last unknown, we substitute 
this value for the last unknown in all the preceding equations before solving for the next-to-last 
unknown. This yields a triangular system with one less equation and one less unknown. For example, in 
the above triangular system, we substitute $x_4 = 4$ in all the preceding equations to obtain the triangular 
system 

$$\begin{aligned} 2x_1 - 3x_2 + 5x_3 &= 17 \\ 5x_2 - x_3 &= -11 \\ 7x_3 &= 7 \end{aligned}$$


We then repeat the process using the new last equation. And so on. 


Echelon Form, Pivot and Free Variables 


The following system of linear equations is said to be in echelon form: 


$$\begin{aligned} 2x_1 + 6x_2 - x_3 + 4x_4 - 2x_5 &= 15 \\ x_3 + 2x_4 + 2x_5 &= 5 \\ 3x_4 - 9x_5 &= 6 \end{aligned}$$

That is, no equation is degenerate and the leading unknown in each equation other than the first is to the 
right of the leading unknown in the preceding equation. The leading unknowns in the system, $x_1$, $x_3$, $x_4$, 
are called pivot variables, and the other unknowns, $x_2$ and $x_5$, are called free variables. 

Generally speaking, an echelon system or a system in echelon form has the following form: 


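$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 + \cdots + a_{1n}x_n &= b_1 \\ a_{2j_2}x_{j_2} + a_{2,j_2+1}x_{j_2+1} + \cdots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{rj_r}x_{j_r} + \cdots + a_{rn}x_n &= b_r \end{aligned} \tag{3.5}$$

where $1 < j_2 < \cdots < j_r$ and $a_{11}, a_{2j_2}, \ldots, a_{rj_r}$ are not zero. The pivot variables are $x_1, x_{j_2}, \ldots, x_{j_r}$. Note 
that $r \le n$. 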


The solution set of any echelon system is described in the following theorem (proved in Problem 3.10). 

THEOREM 3.6: Consider a system of linear equations in echelon form, say with r equations in n 
unknowns. There are two cases: 

(i) $r = n$. That is, there are as many equations as unknowns (triangular form). Then 
the system has a unique solution. 

(ii) $r < n$. That is, there are more unknowns than equations. Then we can arbitrarily 
assign values to the $n - r$ free variables and solve uniquely for the r pivot 
variables, obtaining a solution of the system. 


Suppose an echelon system contains more unknowns than equations. Assuming the field K is infinite, 
the system has an infinite number of solutions, because each of the n — r free variables may be assigned 
any scalar. 

The general solution of a system with free variables may be described in either of two equivalent ways, 
which we illustrate using the above echelon system where there are r = 3 equations and n = 5 unknowns. 
One description is called the "Parametric Form" of the solution, and the other description is called the 
"Free-Variable Form." 


Parametric Form 


Assign arbitrary values, called parameters, to the free variables $x_2$ and $x_5$, say $x_2 = a$ and $x_5 = b$, and 
then use back-substitution to obtain values for the pivot variables $x_1$, $x_3$, $x_4$ in terms of the parameters a 
and b. Specifically, 

(1) Substitute $x_5 = b$ in the last equation, and solve for $x_4$: 

$$3x_4 - 9b = 6 \quad\text{or}\quad 3x_4 = 6 + 9b \quad\text{or}\quad x_4 = 2 + 3b$$

(2) Substitute $x_4 = 2 + 3b$ and $x_5 = b$ into the second equation, and solve for $x_3$: 

$$x_3 + 2(2 + 3b) + 2b = 5 \quad\text{or}\quad x_3 + 4 + 8b = 5 \quad\text{or}\quad x_3 = 1 - 8b$$

(3) Substitute $x_2 = a$, $x_3 = 1 - 8b$, $x_4 = 2 + 3b$, $x_5 = b$ into the first equation, and solve for $x_1$: 

$$2x_1 + 6a - (1 - 8b) + 4(2 + 3b) - 2b = 15 \quad\text{or}\quad x_1 = 4 - 3a - 9b$$

Accordingly, the general solution in parametric form is 

$$x_1 = 4 - 3a - 9b,\quad x_2 = a,\quad x_3 = 1 - 8b,\quad x_4 = 2 + 3b,\quad x_5 = b$$

or, equivalently, $v = (4 - 3a - 9b,\ a,\ 1 - 8b,\ 2 + 3b,\ b)$ where a and b are arbitrary numbers. 


Free-Variable Form 


Use back-substitution to solve for the pivot variables $x_1$, $x_3$, $x_4$ directly in terms of the free variables $x_2$ 
and $x_5$. That is, the last equation gives $x_4 = 2 + 3x_5$. Substitution in the second equation yields 
$x_3 = 1 - 8x_5$, and then substitution in the first equation yields $x_1 = 4 - 3x_2 - 9x_5$. Accordingly, 

$$x_1 = 4 - 3x_2 - 9x_5,\quad x_2 = \text{free variable},\quad x_3 = 1 - 8x_5,\quad x_4 = 2 + 3x_5,\quad x_5 = \text{free variable}$$

or, equivalently, 

$$v = (4 - 3x_2 - 9x_5,\ x_2,\ 1 - 8x_5,\ 2 + 3x_5,\ x_5)$$


is the free-variable form for the general solution of the system. 
We emphasize that there is no difference between the above two forms of the general solution, and the 
use of one or the other to represent the general solution is simply a matter of taste. 


Remark: A particular solution of the above system can be found by assigning any values to the free 
variables and then solving for the pivot variables by back-substitution. For example, setting $x_2 = 1$ and 
$x_5 = 1$, we obtain 

$$x_4 = 2 + 3 = 5, \quad x_3 = 1 - 8 = -7, \quad x_1 = 4 - 3 - 9 = -8$$

Thus, $u = (-8, 1, -7, 5, 1)$ is the particular solution corresponding to $x_2 = 1$ and $x_5 = 1$. 
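A quick numerical check can confirm that the parametric form really does satisfy the echelon system for any choice of the parameters. The sketch below is illustrative only; the names `SYSTEM`, `general_solution`, and `satisfies` are assumptions introduced here, and the sample parameter values are arbitrary.

```python
from fractions import Fraction
from itertools import product

# Echelon system from the text: (coefficients of x1..x5, constant term).
SYSTEM = [
    ([2, 6, -1, 4, -2], 15),
    ([0, 0,  1, 2,  2],  5),
    ([0, 0,  0, 3, -9],  6),
]

def general_solution(a, b):
    """Parametric form with free variables x2 = a and x5 = b."""
    return (4 - 3 * a - 9 * b, a, 1 - 8 * b, 2 + 3 * b, b)

def satisfies(v):
    """True when the vector v satisfies every equation of SYSTEM."""
    return all(sum(c * x for c, x in zip(coeffs, v)) == rhs
               for coeffs, rhs in SYSTEM)

samples = [Fraction(-2), Fraction(0), Fraction(1), Fraction(5, 3)]
all_ok = all(satisfies(general_solution(a, b))
             for a, b in product(samples, repeat=2))
particular = general_solution(1, 1)   # the choice x2 = 1, x5 = 1
```

Checking a handful of parameter values is of course not a proof; Theorem 3.6 is what guarantees that every choice of a and b gives a solution.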

3.6 Gaussian Elimination 


The main method for solving the general system (3.2) of linear equations is called Gaussian elimination. 
It essentially consists of two parts: 


Part A. (Forward Elimination) Step-by-step reduction of the system yielding either a degenerate 
equation with no solution (which indicates the system has no solution) or an equivalent simpler 
system in triangular or echelon form. 


Part B. (Backward Elimination) Step-by-step back-substitution to find the solution of the simpler 
system. 


Part B has already been investigated in Section 3.4. Accordingly, we need only give the algorithm for 
Part A, which is as follows. 


ALGORITHM 3.2 for (Part A): Input: The $m \times n$ system (3.2) of linear equations. 

ELIMINATION STEP: Find the first unknown in the system with a nonzero coefficient (which now 
must be $x_1$). 

(a) Arrange so that $a_{11} \ne 0$. That is, if necessary, interchange equations so that the first unknown $x_1$ 
appears with a nonzero coefficient in the first equation. 

(b) Use $a_{11}$ as a pivot to eliminate $x_1$ from all equations except the first equation. That is, for $i > 1$: 
(1) Set $m = -a_{i1}/a_{11}$; (2) Replace $L_i$ by $mL_1 + L_i$ 

The system now has the following form: 

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\ a_{2j_2}x_{j_2} + \cdots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{mj_2}x_{j_2} + \cdots + a_{mn}x_n &= b_m \end{aligned}$$

where $x_1$ does not appear in any equation except the first, $a_{11} \ne 0$, and $x_{j_2}$ denotes the first 
unknown with a nonzero coefficient in any equation other than the first. 


(c) Examine each new equation L. 
(1) If L has the form $0x_1 + 0x_2 + \cdots + 0x_n = b$ with $b \ne 0$, then 
STOP 

The system is inconsistent and has no solution. 

(2) If L has the form $0x_1 + 0x_2 + \cdots + 0x_n = 0$ or if L is a multiple of another equation, then delete 
L from the system. 


RECURSION STEP: Repeat the Elimination Step with each new ‘‘smaller’’ subsystem formed by all 
the equations excluding the first equation. 


OUTPUT: Finally, the system is reduced to triangular or echelon form, or a degenerate equation with 
no solution is obtained indicating an inconsistent system. 

The next remarks refer to the Elimination Step in Algorithm 3.2. 

(1) The following number m in (b) is called the multiplier: 

$$m = -\frac{a_{i1}}{a_{11}} = -\frac{\text{coefficient to be deleted}}{\text{pivot}}$$

(2) One could alternatively apply the following operation in (b): 

Replace $L_i$ by $-a_{i1}L_1 + a_{11}L_i$ 

This would avoid fractions if all the scalars were originally integers. 
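The Elimination Step and Recursion Step can be sketched as a short loop. The sketch below uses the fraction-free operation of remark (2), so it assumes integer coefficients; each equation is stored as a list `[a1, ..., an, b]`, and the function name `forward_eliminate` is an assumption made for illustration. An equation that is a multiple of another one turns into 0 = 0 during elimination and is then deleted, which covers step (c)(2).

```python
def forward_eliminate(rows):
    """Fraction-free forward elimination on integer equations.

    Each row is [a1, ..., an, b], meaning a1*x1 + ... + an*xn = b.
    Returns the equations in echelon form, or None when a degenerate
    equation 0 = b with b != 0 appears (inconsistent system).
    """
    rows = [list(r) for r in rows]
    n = len(rows[0]) - 1
    done = []                          # equations already in final position
    while rows:
        # Step (c): delete equations 0 = 0, stop on 0 = b with b != 0.
        kept = []
        for r in rows:
            if any(r[:n]):
                kept.append(r)
            elif r[n] != 0:
                return None
        rows = kept
        if not rows:
            break
        # First unknown with a nonzero coefficient in the current subsystem.
        j = min(next(k for k in range(n) if r[k] != 0) for r in rows)
        # Step (a): interchange so that the first equation contains it.
        p = next(i for i, r in enumerate(rows) if r[j] != 0)
        rows[0], rows[p] = rows[p], rows[0]
        first = rows[0]
        # Step (b), fraction-free: replace L_i by -a_ij*L_1 + a_1j*L_i.
        for i in range(1, len(rows)):
            a_ij = rows[i][j]
            if a_ij != 0:
                rows[i] = [-a_ij * f + first[j] * x
                           for f, x in zip(first, rows[i])]
        done.append(first)
        rows = rows[1:]                # recursion on the smaller subsystem
    return done

triangular = forward_eliminate([[1, -3, -2, 6],
                                [2, -4, -3, 8],
                                [-3, 6, 8, -5]])
inconsistent = forward_eliminate([[1, 3, -2, 5, 4],
                                  [2, 8, -1, 9, 9],
                                  [3, 5, -12, 17, 7]])
```

The first call uses the system of the Gaussian elimination example in this section and the second uses the system of Example 3.8; the sketch returns a triangular system for the former and `None` for the latter.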

Gaussian Elimination Example 

Here we illustrate in detail Gaussian elimination using the following system of linear equations: 

$$\begin{aligned} L_1:&\quad x - 3y - 2z = 6 \\ L_2:&\quad 2x - 4y - 3z = 8 \\ L_3:&\quad -3x + 6y + 8z = -5 \end{aligned}$$

Part A. We use the coefficient 1 of x in the first equation $L_1$ as the pivot in order to eliminate x from 
the second equation $L_2$ and from the third equation $L_3$. This is accomplished as follows: 

(1) Multiply $L_1$ by the multiplier $m = -2$ and add it to $L_2$; that is, "Replace $L_2$ by $-2L_1 + L_2$." 

(2) Multiply $L_1$ by the multiplier $m = 3$ and add it to $L_3$; that is, "Replace $L_3$ by $3L_1 + L_3$." 

These steps yield 

$$\begin{array}{rl} (-2)L_1: & -2x + 6y + 4z = -12 \\ L_2: & 2x - 4y - 3z = 8 \\ \hline \text{New } L_2: & 2y + z = -4 \end{array} \qquad \begin{array}{rl} 3L_1: & 3x - 9y - 6z = 18 \\ L_3: & -3x + 6y + 8z = -5 \\ \hline \text{New } L_3: & -3y + 2z = 13 \end{array}$$


Thus, the original system is replaced by the following system: 


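$$\begin{aligned} L_1:&\quad x - 3y - 2z = 6 \\ L_2:&\quad 2y + z = -4 \\ L_3:&\quad -3y + 2z = 13 \end{aligned}$$

(Note that the equations $L_2$ and $L_3$ form a subsystem with one less equation and one less unknown than 
the original system.) 
Next we use the coefficient 2 of y in the (new) second equation $L_2$ as the pivot in order to eliminate y 
from the (new) third equation $L_3$. This is accomplished as follows: 

(3) Multiply $L_2$ by the multiplier $m = \tfrac{3}{2}$ and add it to $L_3$; that is, "Replace $L_3$ by $\tfrac{3}{2}L_2 + L_3$." 
(Alternately, "Replace $L_3$ by $3L_2 + 2L_3$," which will avoid fractions.) 

This step yields 

$$\begin{array}{rl} \tfrac{3}{2}L_2: & 3y + \tfrac{3}{2}z = -6 \\ L_3: & -3y + 2z = 13 \\ \hline \text{New } L_3: & \tfrac{7}{2}z = 7 \end{array} \qquad\text{or}\qquad \begin{array}{rl} 3L_2: & 6y + 3z = -12 \\ 2L_3: & -6y + 4z = 26 \\ \hline \text{New } L_3: & 7z = 14 \end{array}$$

Thus, our system is replaced by the following system: 

$$\begin{aligned} L_1:&\quad x - 3y - 2z = 6 \\ L_2:&\quad 2y + z = -4 \\ L_3:&\quad 7z = 14 \quad (\text{or } \tfrac{7}{2}z = 7) \end{aligned}$$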


The system is now in triangular form, so Part A is completed. 


Part B. The values for the unknowns are obtained in reverse order, z,y,x, by back-substitution. 
Specifically, 


(1) Solve for z in $L_3$ to get $z = 2$. 
(2) Substitute $z = 2$ in $L_2$, and solve for y to get $y = -3$. 
(3) Substitute $y = -3$ and $z = 2$ in $L_1$, and solve for x to get $x = 1$. 


Thus, the solution of the triangular system and hence the original system is as follows: 

$$x = 1,\quad y = -3,\quad z = 2 \quad\text{or, equivalently,}\quad u = (1, -3, 2).$$

Condensed Format 


The Gaussian elimination algorithm involves rewriting systems of linear equations. Sometimes we can 
avoid excessive recopying of some of the equations by adopting a "condensed format." This format for 
the solution of the above system follows: 


Number    Equation                  Operation 
(1)       $x - 3y - 2z = 6$ 
(2)       $2x - 4y - 3z = 8$ 
(3)       $-3x + 6y + 8z = -5$ 
(2′)      $2y + z = -4$             Replace $L_2$ by $-2L_1 + L_2$ 
(3′)      $-3y + 2z = 13$           Replace $L_3$ by $3L_1 + L_3$ 
(3″)      $7z = 14$                 Replace $L_3$ by $3L_2 + 2L_3$ 


That is, first we write down the number of each of the original equations. As we apply the Gaussian 
elimination algorithm to the system, we only write down the new equations, and we label each new equation 
using the same number as the original corresponding equation, but with an added prime. (After each new 
equation, we will indicate, for instructional purposes, the elementary operation that yielded the new equation.) 
The system in triangular form consists of equations (1), (2’), and (3”), the numbers with the largest 
number of primes. Applying back-substitution to these equations again yields x = 1, y = —3, z = 2. 


Remark: If two equations need to be interchanged, say to obtain a nonzero coefficient as a pivot, 
then this is easily accomplished in the format by simply renumbering the two equations rather than 
changing their positions. 

EXAMPLE 3.7 Solve the following system: 

$$\begin{aligned} x + 2y - 3z &= 1 \\ 2x + 5y - 8z &= 4 \\ 3x + 8y - 13z &= 7 \end{aligned}$$

We solve the system by Gaussian elimination. 


Part A. (Forward Elimination) We use the coefficient 1 of x in the first equation $L_1$ as the pivot in order to 
eliminate x from the second equation $L_2$ and from the third equation $L_3$. This is accomplished as follows: 
(1) Multiply $L_1$ by the multiplier $m = -2$ and add it to $L_2$; that is, "Replace $L_2$ by $-2L_1 + L_2$." 
(2) Multiply $L_1$ by the multiplier $m = -3$ and add it to $L_3$; that is, "Replace $L_3$ by $-3L_1 + L_3$." 


The two steps yield 

$$\begin{aligned} x + 2y - 3z &= 1 \\ y - 2z &= 2 \\ 2y - 4z &= 4 \end{aligned} \qquad\text{or}\qquad \begin{aligned} x + 2y - 3z &= 1 \\ y - 2z &= 2 \end{aligned}$$


(The third equation is deleted, because it is a multiple of the second equation.) The system is now in echelon form 
with free variable z. 


Part B. (Backward Elimination) To obtain the general solution, let the free variable z = a, and solve for x and y 
by back-substitution. Substitute z = a in the second equation to obtain y= 2 + 2a. Then substitute z = a and 
y = 2 + 2a into the first equation to obtain 

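$$x + 2(2 + 2a) - 3a = 1 \quad\text{or}\quad x + 4 + 4a - 3a = 1 \quad\text{or}\quad x = -3 - a$$

Thus, the following is the general solution where a is a parameter: 

$$x = -3 - a,\quad y = 2 + 2a,\quad z = a \quad\text{or}\quad u = (-3 - a,\ 2 + 2a,\ a)$$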

EXAMPLE 3.8 Solve the following system: 

$$\begin{aligned} x_1 + 3x_2 - 2x_3 + 5x_4 &= 4 \\ 2x_1 + 8x_2 - x_3 + 9x_4 &= 9 \\ 3x_1 + 5x_2 - 12x_3 + 17x_4 &= 7 \end{aligned}$$

We use Gaussian elimination. 


Part A. (Forward Elimination) We use the coefficient 1 of $x_1$ in the first equation $L_1$ as the pivot in order to 
eliminate $x_1$ from the second equation $L_2$ and from the third equation $L_3$. This is accomplished by the following 
operations: 

(1) "Replace $L_2$ by $-2L_1 + L_2$" and (2) "Replace $L_3$ by $-3L_1 + L_3$" 

These yield: 

$$\begin{aligned} x_1 + 3x_2 - 2x_3 + 5x_4 &= 4 \\ 2x_2 + 3x_3 - x_4 &= 1 \\ -4x_2 - 6x_3 + 2x_4 &= -5 \end{aligned}$$

We now use the coefficient 2 of $x_2$ in the second equation $L_2$ as the pivot and the multiplier $m = 2$ in order to 
eliminate $x_2$ from the third equation $L_3$. This is accomplished by the operation "Replace $L_3$ by $2L_2 + L_3$," which 
then yields the degenerate equation 

$$0x_1 + 0x_2 + 0x_3 + 0x_4 = -3$$


This equation and, hence, the original system have no solution: 
DO NOT CONTINUE 

Remark 1: As in the above examples, Part A of Gaussian elimination tells us whether or not the 
system has a solution; that is, whether or not the system is consistent. Accordingly, Part B need never be 
applied when a system has no solution. 


Remark 2: If a system of linear equations has more than four unknowns and four equations, then it 
may be more convenient to use the matrix format for solving the system. This matrix format is discussed 
later. 


3.7 Echelon Matrices, Row Canonical Form, Row Equivalence 


One way to solve a system of linear equations is by working with its augmented matrix M rather than the 
system itself. This section introduces the necessary matrix concepts for such a discussion. These 
concepts, such as echelon matrices and elementary row operations, are also of independent interest. 


Echelon Matrices 


A matrix A is called an echelon matrix, or is said to be in echelon form, if the following two conditions 
hold (where a leading nonzero element of a row of A is the first nonzero element in the row): 


(1) All zero rows, if any, are at the bottom of the matrix. 
(2) Each leading nonzero entry in a row is to the right of the leading nonzero entry in the preceding row. 


That is, $A = [a_{ij}]$ is an echelon matrix if there exist nonzero entries 

$$a_{1j_1},\ a_{2j_2},\ \ldots,\ a_{rj_r}, \quad\text{where}\quad j_1 < j_2 < \cdots < j_r$$

with the property that 

$$a_{ij} = 0 \quad\text{for}\quad \begin{cases} \text{(i)}\ \ i \le r,\ j < j_i \\ \text{(ii)}\ \ i > r \end{cases}$$

The entries $a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r}$, which are the leading nonzero elements in their respective rows, are called 
the pivots of the echelon matrix. 


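EXAMPLE 3.9 The following is an echelon matrix whose pivots have been boxed: 

$$A = \begin{bmatrix} 0 & \boxed{2} & 3 & 4 & 5 & 9 & 0 & 7 \\ 0 & 0 & 0 & \boxed{3} & 4 & 1 & 2 & 5 \\ 0 & 0 & 0 & 0 & 0 & \boxed{5} & 7 & 2 \\ 0 & 0 & 0 & 0 & 0 & 0 & \boxed{8} & 6 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

Observe that the pivots are in columns $C_2$, $C_4$, $C_6$, $C_7$, and each is to the right of the one above. Using the above 
notation, the pivots are 

$$a_{1j_1} = 2, \quad a_{2j_2} = 3, \quad a_{3j_3} = 5, \quad a_{4j_4} = 8$$

where $j_1 = 2$, $j_2 = 4$, $j_3 = 6$, $j_4 = 7$. Here $r = 4$. 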


Row Canonical Form 


A matrix A is said to be in row canonical form (or row-reduced echelon form) if it is an echelon matrix— 
that is, if it satisfies the above properties (1) and (2), and if it satisfies the following additional two 
properties: 

(3) Each pivot (leading nonzero entry) is equal to 1. 

(4) Each pivot is the only nonzero entry in its column. 

The major difference between an echelon matrix and a matrix in row canonical form is that in an 
echelon matrix there must be zeros below the pivots [Properties (1) and (2)], but in a matrix in row 
canonical form, each pivot must also equal 1 [Property (3)] and there must also be zeros above the pivots 
[Property (4)]. 

The zero matrix 0 of any size and the identity matrix I of any size are important special examples of 
matrices in row canonical form. 


EXAMPLE 3.10 


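The following are echelon matrices whose pivots have been boxed: 

$$\begin{bmatrix} \boxed{2} & 3 & 2 & 0 & 4 & 5 & -6 \\ 0 & 0 & 0 & \boxed{1} & -3 & 2 & 0 \\ 0 & 0 & 0 & 0 & 0 & \boxed{6} & 2 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \qquad \begin{bmatrix} \boxed{1} & 2 & 3 \\ 0 & 0 & \boxed{1} \\ 0 & 0 & 0 \end{bmatrix}, \qquad \begin{bmatrix} 0 & \boxed{1} & 3 & 0 & 0 & 4 \\ 0 & 0 & 0 & \boxed{1} & 0 & -3 \\ 0 & 0 & 0 & 0 & \boxed{1} & 2 \end{bmatrix}$$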


The third matrix is also an example of a matrix in row canonical form. The second matrix is not in row canonical 
form, because it does not satisfy property (4); that is, there is a nonzero entry above the second pivot in the third 
column. The first matrix is not in row canonical form, because it satisfies neither property (3) nor property (4); that 
is, some pivots are not equal to 1 and there are nonzero entries above the pivots. 

Elementary Row Operations 


Suppose A is a matrix with rows $R_1, R_2, \ldots, R_m$. The following operations on A are called elementary row 
operations. 


$[E_1]$ (Row Interchange): Interchange rows $R_i$ and $R_j$. This may be written as 
"Interchange $R_i$ and $R_j$" or "$R_i \leftrightarrow R_j$" 


$[E_2]$ (Row Scaling): Replace row $R_i$ by a nonzero multiple $kR_i$ of itself. This may be written as 
"Replace $R_i$ by $kR_i$ ($k \ne 0$)" or "$kR_i \to R_i$" 


$[E_3]$ (Row Addition): Replace row $R_j$ by the sum of a multiple $kR_i$ of a row $R_i$ and itself. This may be 
written as 
"Replace $R_j$ by $kR_i + R_j$" or "$kR_i + R_j \to R_j$" 


The arrow $\to$ in $[E_2]$ and $[E_3]$ may be read as "replaces." 
Sometimes (say to avoid fractions when all the given scalars are integers) we may apply $[E_2]$ and $[E_3]$ 
in one step; that is, we may apply the following operation: 


$[E]$ Replace $R_j$ by the sum of a multiple $kR_i$ of a row $R_i$ and a nonzero multiple $k'R_j$ of itself. This may 
be written as 
"Replace $R_j$ by $kR_i + k'R_j$ ($k' \ne 0$)" or "$kR_i + k'R_j \to R_j$" 


We emphasize that in operations $[E_3]$ and $[E]$ only row $R_j$ is changed. 
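The three elementary row operations translate directly into small functions on a matrix stored as a list of rows. The following sketch is illustrative; the function names `interchange`, `scale`, and `add_multiple` are assumptions, and row indices are 0-based here (so $R_1$ is row 0). Each function returns a new matrix and leaves its input unchanged.

```python
from fractions import Fraction

def interchange(A, i, j):
    """[E1] Interchange rows R_i and R_j."""
    B = [row[:] for row in A]
    B[i], B[j] = B[j], B[i]
    return B

def scale(A, i, k):
    """[E2] Replace R_i by k*R_i, where k must be nonzero."""
    if k == 0:
        raise ValueError("k must be nonzero")
    B = [row[:] for row in A]
    B[i] = [k * x for x in B[i]]
    return B

def add_multiple(A, i, j, k):
    """[E3] Replace R_j by k*R_i + R_j; only row R_j changes."""
    B = [row[:] for row in A]
    B[j] = [k * x + y for x, y in zip(B[i], B[j])]
    return B

A = [[1, 2, -3], [2, 4, -4], [3, 6, -6]]
B = add_multiple(A, 0, 1, -2)           # Replace R_2 by -2R_1 + R_2
restored = add_multiple(B, 0, 1, 2)     # same type of operation, with -k
halved = scale(A, 1, Fraction(1, 2))    # Replace R_2 by (1/2)R_2
```

Applying the same type of operation with −k (or with 1/k for scaling) restores the original matrix, which is the reason each elementary row operation can be undone.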


Row Equivalence, Rank of a Matrix 

A matrix A is said to be row equivalent to a matrix B, written 

A~B 

if B can be obtained from A by a sequence of elementary row operations. In the case that B is also an 
echelon matrix, B is called an echelon form of A. 

The following are two basic results on row equivalence. 

THEOREM 3.7: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are row equivalent echelon matrices with respective 
pivot entries 

$$a_{1j_1},\ a_{2j_2},\ \ldots,\ a_{rj_r} \quad\text{and}\quad b_{1k_1},\ b_{2k_2},\ \ldots,\ b_{sk_s}$$

Then A and B have the same number of nonzero rows (that is, $r = s$) and the pivot 
entries are in the same positions (that is, $j_1 = k_1$, $j_2 = k_2$, ..., $j_r = k_r$). 

THEOREM 3.8: Every matrix A is row equivalent to a unique matrix in row canonical form. 


The proofs of the above theorems will be postponed to Chapter 4. The unique matrix in Theorem 3.8 
is called the row canonical form of A. 
Using the above theorems, we can now give our first definition of the rank of a matrix. 


DEFINITION: The rank of a matrix A, written rank(A), is equal to the number of pivots in an echelon 
form of A. 


The rank is a very important property of a matrix and, depending on the context in which the 
matrix is used, it will be defined in many different ways. Of course, all the definitions lead to the 
same number. 


The next section gives the matrix format of Gaussian elimination, which finds an echelon form of any 
matrix A (and hence the rank of A), and also finds the row canonical form of A. 

One can show that row equivalence is an equivalence relation. That is, 


(1) $A \sim A$ for any matrix A. 
(2) If $A \sim B$, then $B \sim A$. 
(3) If $A \sim B$ and $B \sim C$, then $A \sim C$. 


Property (2) comes from the fact that each elementary row operation has an inverse operation of the same 
type. Namely, 


(i) "Interchange $R_i$ and $R_j$" is its own inverse. 
(ii) "Replace $R_i$ by $kR_i$" and "Replace $R_i$ by $(1/k)R_i$" are inverses. 
(iii) "Replace $R_j$ by $kR_i + R_j$" and "Replace $R_j$ by $-kR_i + R_j$" are inverses. 


There is a similar result for operation [E] (Problem 3.73). 


3.8 Gaussian Elimination, Matrix Formulation 


This section gives two matrix algorithms that accomplish the following: 


(1) Algorithm 3.3 transforms any matrix A into an echelon form. 
(2) Algorithm 3.4 transforms the echelon matrix into its row canonical form. 


These algorithms, which use the elementary row operations, are simply restatements of Gaussian 
elimination as applied to matrices rather than to linear equations. (The term ‘‘row reduce’’ or simply 
“‘reduce’’ will mean to transform a matrix by the elementary row operations.) 


ALGORITHM 3.3 (Forward Elimination): The input is any matrix A. (The algorithm puts 0’s below 
each pivot, working from the ‘‘top-down.’’) The output is 
an echelon form of A. 


Step 1. Find the first column with a nonzero entry. Let $j_1$ denote this column. 


(a) Arrange so that $a_{1j_1} \ne 0$. That is, if necessary, interchange rows so that a nonzero entry 
appears in the first row in column $j_1$. 


(b) Use $a_{1j_1}$ as a pivot to obtain 0's below $a_{1j_1}$. 
Specifically, for $i > 1$: 
(1) Set $m = -a_{ij_1}/a_{1j_1}$; (2) Replace $R_i$ by $mR_1 + R_i$ 
[That is, apply the operation $-(a_{ij_1}/a_{1j_1})R_1 + R_i \to R_i$.] 


Step 2. Repeat Step 1 with the submatrix formed by all the rows excluding the first row. Here we let $j_2$ 
denote the first column in the submatrix with a nonzero entry. Hence, at the end of Step 2, we 
have $a_{2j_2} \ne 0$. 


Steps 3 to r. Continue the above process until a submatrix has only zero rows. 
We emphasize that at the end of the algorithm, the pivots will be 

$$a_{1j_1},\ a_{2j_2},\ \ldots,\ a_{rj_r}$$

where r denotes the number of nonzero rows in the final echelon matrix. 


Remark 1: The following number m in Step 1(b) is called the multiplier: 

$$m = -\frac{a_{ij_1}}{a_{1j_1}} = -\frac{\text{entry to be deleted}}{\text{pivot}}$$

Remark 2: One could replace the operation in Step 1(b) by the following, which would avoid 
fractions if all the scalars were originally integers. 

Replace $R_i$ by $-a_{ij_1}R_1 + a_{1j_1}R_i$ 
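Algorithm 3.3 can be sketched as follows. This is a minimal sketch, assuming the matrix is given as a list of rows; the function name `echelon_form` is chosen for illustration. Scanning the columns from left to right and skipping those that are zero in the current submatrix has the same effect as "find the first column with a nonzero entry" at each step. The multiplier of Remark 1 is used with exact fractions, and column indices are 0-based.

```python
from fractions import Fraction

def echelon_form(A):
    """Return (an echelon form of A, its pivot columns), 0-based columns."""
    M = [[Fraction(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    pivot_cols = []
    top = 0                                   # first row of the submatrix
    for j in range(cols):
        if top == rows:
            break
        # Look for a nonzero entry in column j at or below row `top`.
        p = next((i for i in range(top, rows) if M[i][j] != 0), None)
        if p is None:
            continue                          # column is zero in submatrix
        M[top], M[p] = M[p], M[top]           # (a) interchange if necessary
        for i in range(top + 1, rows):        # (b) zeros below the pivot
            m = -M[i][j] / M[top][j]          # the multiplier
            M[i] = [m * a + b for a, b in zip(M[top], M[i])]
        pivot_cols.append(j)
        top += 1
    return M, pivot_cols

A = [[1, 2, -3, 1, 2],
     [2, 4, -4, 6, 10],
     [3, 6, -6, 9, 13]]
E, pivot_cols = echelon_form(A)
```

The number of pivot columns returned is the number r of nonzero rows in the echelon form.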


ALGORITHM 3.4 (Backward Elimination): The input is a matrix $A = [a_{ij}]$ in echelon form with pivot 
entries 

$$a_{1j_1},\ a_{2j_2},\ \ldots,\ a_{rj_r}$$

The output is the row canonical form of A. 

Step 1. (a) (Use row scaling so the last pivot equals 1.) Multiply the last nonzero row $R_r$ by $1/a_{rj_r}$. 
(b) (Use $a_{rj_r} = 1$ to obtain 0's above the pivot.) For $i = r - 1, r - 2, \ldots, 2, 1$: 

(1) Set $m = -a_{ij_r}$; (2) Replace $R_i$ by $mR_r + R_i$ 

(That is, apply the operations $-a_{ij_r}R_r + R_i \to R_i$.) 

Steps 2 to $r - 1$. Repeat Step 1 for rows $R_{r-1}, R_{r-2}, \ldots, R_2$. 


Step r. (Use row scaling so the first pivot equals 1.) Multiply $R_1$ by $1/a_{1j_1}$. 
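A sketch of Algorithm 3.4 is given below. It assumes its input is already in echelon form (the function does not check this), and the name `row_canonical_form` is an assumption made for illustration. The pivots are processed from the last one back to the first, scaling each pivot to 1 and then clearing the entries above it.

```python
from fractions import Fraction

def row_canonical_form(E):
    """Reduce an echelon matrix E to its row canonical form."""
    M = [[Fraction(x) for x in row] for row in E]
    # Locate the pivot (first nonzero entry) of each nonzero row.
    pivots = []
    for i, row in enumerate(M):
        j = next((k for k, x in enumerate(row) if x != 0), None)
        if j is not None:
            pivots.append((i, j))
    # Work from the last pivot back to the first.
    for r, j in reversed(pivots):
        pivot = M[r][j]
        M[r] = [x / pivot for x in M[r]]      # row scaling: pivot becomes 1
        for i in range(r):                    # zeros above the pivot
            m = -M[i][j]
            M[i] = [m * a + b for a, b in zip(M[r], M[i])]
    return M

# An echelon matrix with pivots 1, 2, -2 in columns 1, 3, 5.
E = [[1, 2, -3, 1, 2],
     [0, 0, 2, 4, 6],
     [0, 0, 0, 0, -2]]
R = row_canonical_form(E)
```

For this input the sketch first scales the last row by −1/2, clears the last column above it, then scales the second row by 1/2 and clears the entry above that pivot.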


There is an alternative form of Algorithm 3.4, which we describe here in words. The formal 
description of this algorithm is left to the reader as a supplementary problem. 


ALTERNATIVE ALGORITHM 3.4 Puts 0’s above the pivots row by row from the bottom up (rather 
than column by column from right to left). 


The alternative algorithm, when applied to an augmented matrix M of a system of linear equations, is 
essentially the same as solving for the pivot unknowns one after the other from the bottom up. 


Remark: We emphasize that Gaussian elimination is a two-stage process. Specifically, 
Stage A (Algorithm 3.3). Puts 0's below each pivot, working from the top row $R_1$ down. 

Stage B (Algorithm 3.4). Puts 0's above each pivot, working from the bottom row $R_r$ up. 


There is another algorithm, called Gauss-Jordan, that also row reduces a matrix to its row canonical 
form. The difference is that Gauss-Jordan puts 0's both below and above each pivot as it works its way 
from the top row $R_1$ down. Although Gauss-Jordan may be easier to state and understand, it is much less 
efficient than the two-stage Gaussian elimination algorithm. 


EXAMPLE 3.11 Consider the matrix 

$$A = \begin{bmatrix} 1 & 2 & -3 & 1 & 2 \\ 2 & 4 & -4 & 6 & 10 \\ 3 & 6 & -6 & 9 & 13 \end{bmatrix}$$

(a) Use Algorithm 3.3 to reduce A to an echelon form. 
(b) Use Algorithm 3.4 to further reduce A to its row canonical form. 


(a) First use $a_{11} = 1$ as a pivot to obtain 0's below $a_{11}$; that is, apply the operations "Replace $R_2$ by $-2R_1 + R_2$" 
and "Replace $R_3$ by $-3R_1 + R_3$." Then use $a_{23} = 2$ as a pivot to obtain 0 below $a_{23}$; that is, apply the operation 
"Replace $R_3$ by $-\tfrac{3}{2}R_2 + R_3$." This yields 

$$A \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 2 \\ 0 & 0 & 2 & 4 & 6 \\ 0 & 0 & 3 & 6 & 7 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 2 \\ 0 & 0 & 2 & 4 & 6 \\ 0 & 0 & 0 & 0 & -2 \end{bmatrix}$$


The matrix is now in echelon form. 
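(b) Multiply $R_3$ by $-\tfrac{1}{2}$ so the pivot entry $a_{35} = 1$, and then use $a_{35} = 1$ as a pivot to obtain 0's above it by the 
operations "Replace $R_2$ by $-6R_3 + R_2$" and then "Replace $R_1$ by $-2R_3 + R_1$." This yields 

$$A \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 2 \\ 0 & 0 & 2 & 4 & 6 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 0 \\ 0 & 0 & 2 & 4 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$

Multiply $R_2$ by $\tfrac{1}{2}$ so the pivot entry $a_{23} = 1$, and then use $a_{23} = 1$ as a pivot to obtain 0's above it by the 
operation "Replace $R_1$ by $3R_2 + R_1$." This yields 

$$A \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & 7 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$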


The last matrix is the row canonical form of A. 


Application to Systems of Linear Equations 


One way to solve a system of linear equations is by working with its augmented matrix M rather than the 
equations themselves. Specifically, we reduce M to echelon form (which tells us whether the system has a 
solution), and then further reduce M to its row canonical form (which essentially gives the solution of the 
original system of linear equations). The justification for this process comes from the following facts: 


(1) Any elementary row operation on the augmented matrix M of the system is equivalent to applying 
the corresponding operation on the system itself. 


(2) The system has a solution if and only if the echelon form of the augmented matrix M does not have a 
row of the form $(0, 0, \ldots, 0, b)$ with $b \ne 0$. 


(3) In the row canonical form of the augmented matrix M (excluding zero rows), the coefficient of each 
basic variable is a pivot entry equal to 1, and it is the only nonzero entry in its respective column; 
hence, the free-variable form of the solution of the system of linear equations is obtained by simply 
transferring the free variables to the other side. 


This process is illustrated below. 


EXAMPLE 3.12 Solve each of the following systems: 


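$$\text{(a)}\ \begin{aligned} x_1 + x_2 - 2x_3 + 4x_4 &= 5 \\ 2x_1 + 2x_2 - 3x_3 + x_4 &= 3 \\ 3x_1 + 3x_2 - 4x_3 - 2x_4 &= 1 \end{aligned} \qquad \text{(b)}\ \begin{aligned} x_1 + x_2 - 2x_3 + 3x_4 &= 4 \\ 2x_1 + 3x_2 + 3x_3 - x_4 &= 3 \\ 5x_1 + 7x_2 + 4x_3 + x_4 &= 5 \end{aligned} \qquad \text{(c)}\ \begin{aligned} x + 2y + z &= 3 \\ 2x + 5y - z &= -4 \\ 3x - 2y - z &= 5 \end{aligned}$$

(a) Reduce its augmented matrix M to echelon form and then to row canonical form as follows: 

$$M = \begin{bmatrix} 1 & 1 & -2 & 4 & 5 \\ 2 & 2 & -3 & 1 & 3 \\ 3 & 3 & -4 & -2 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -2 & 4 & 5 \\ 0 & 0 & 1 & -7 & -7 \\ 0 & 0 & 2 & -14 & -14 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 0 & -10 & -9 \\ 0 & 0 & 1 & -7 & -7 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

Rewrite the row canonical form in terms of a system of linear equations to obtain the free variable form of the 
solution. That is, 

$$\begin{aligned} x_1 + x_2 - 10x_4 &= -9 \\ x_3 - 7x_4 &= -7 \end{aligned} \qquad\text{or}\qquad \begin{aligned} x_1 &= -9 - x_2 + 10x_4 \\ x_3 &= -7 + 7x_4 \end{aligned}$$

(The zero row is omitted in the solution.) Observe that $x_1$ and $x_3$ are the pivot variables, and $x_2$ and $x_4$ are the 
free variables. 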

(b) First reduce its augmented matrix M to echelon form as follows: 

$$M = \begin{bmatrix} 1 & 1 & -2 & 3 & 4 \\ 2 & 3 & 3 & -1 & 3 \\ 5 & 7 & 4 & 1 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -2 & 3 & 4 \\ 0 & 1 & 7 & -7 & -5 \\ 0 & 2 & 14 & -14 & -15 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -2 & 3 & 4 \\ 0 & 1 & 7 & -7 & -5 \\ 0 & 0 & 0 & 0 & -5 \end{bmatrix}$$

There is no need to continue to find the row canonical form of M, because the echelon form already tells us that 
the system has no solution. Specifically, the third row of the echelon matrix corresponds to the degenerate 
equation 

$$0x_1 + 0x_2 + 0x_3 + 0x_4 = -5$$

which has no solution. Thus, the system has no solution. 


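(c) Reduce its augmented matrix M to echelon form and then to row canonical form as follows: 

$$M = \begin{bmatrix} 1 & 2 & 1 & 3 \\ 2 & 5 & -1 & -4 \\ 3 & -2 & -1 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & -3 & -10 \\ 0 & -8 & -4 & -4 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & -3 & -10 \\ 0 & 0 & -28 & -84 \end{bmatrix}$$

$$\sim \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & -3 & -10 \\ 0 & 0 & 1 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 3 \end{bmatrix}$$

Thus, the system has the unique solution $x = 2$, $y = -1$, $z = 3$, or, equivalently, the vector $u = (2, -1, 3)$. We 
note that the echelon form of M already indicated that the solution was unique, because it corresponded to a 
triangular system. 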
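The three outcomes of Example 3.12 can be reproduced with a short program that reduces the augmented matrix and reads off the pivot columns. This is a sketch under stated assumptions: the names `rref` and `classify` are chosen here, and for brevity the reduction clears entries above and below each pivot in a single pass (the Gauss-Jordan ordering mentioned earlier). By Theorem 3.8 the resulting row canonical form is the same as the one produced by the two-stage method.

```python
from fractions import Fraction

def rref(M):
    """Row canonical form of M and its pivot columns (0-based)."""
    M = [[Fraction(x) for x in row] for row in M]
    pivots, top = [], 0
    for j in range(len(M[0])):
        p = next((i for i in range(top, len(M)) if M[i][j] != 0), None)
        if p is None:
            continue
        M[top], M[p] = M[p], M[top]
        pivot = M[top][j]
        M[top] = [x / pivot for x in M[top]]
        for i in range(len(M)):
            if i != top and M[i][j] != 0:
                m = -M[i][j]
                M[i] = [m * a + b for a, b in zip(M[top], M[i])]
        pivots.append(j)
        top += 1
        if top == len(M):
            break
    return M, pivots

def classify(M):
    """Classify the system whose augmented matrix is M = [A, B]."""
    R, pivots = rref(M)
    n = len(M[0]) - 1                      # number of unknowns
    if n in pivots:                        # a row (0, ..., 0, b) with b != 0
        return "no solution", None
    if len(pivots) == n:                   # a pivot for every unknown
        return "unique", [R[i][n] for i in range(n)]
    free = [j for j in range(n) if j not in pivots]
    return "infinite", free                # 0-based free-variable columns

case_a = classify([[1, 1, -2, 4, 5], [2, 2, -3, 1, 3], [3, 3, -4, -2, 1]])
case_b = classify([[1, 1, -2, 3, 4], [2, 3, 3, -1, 3], [5, 7, 4, 1, 5]])
case_c = classify([[1, 2, 1, 3], [2, 5, -1, -4], [3, -2, -1, 5]])
```

In case (a) the free columns 1 and 3 (0-based) correspond to the free variables $x_2$ and $x_4$ of the text.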


Application to Existence and Uniqueness Theorems 


This subsection gives theoretical conditions for the existence and uniqueness of a solution of a system of 
linear equations using the notion of the rank of a matrix. 


THEOREM 3.9: Consider a system of linear equations in n unknowns with augmented matrix 
$M = [A, B]$. Then, 


(a) The system has a solution if and only if rank(A) = rank(M). 
(b) The solution is unique if and only if rank(A) = rank(M) = n. 


Proof of (a). The system has a solution if and only if an echelon form of $M = [A, B]$ does not have a 
row of the form 

$$(0, 0, \ldots, 0, b), \quad\text{with } b \ne 0$$

If an echelon form of M does have such a row, then b is a pivot of M but not of A, and hence, 
rank(M) > rank(A). Otherwise, the echelon forms of A and M have the same pivots, and hence, 
rank(A) = rank(M). This proves (a). 


Proof of (b). The system has a unique solution if and only if an echelon form has no free variable. This 
means there is a pivot for each unknown. Accordingly, n = rank(A) = rank(M). This proves (b). 


The above proof uses the fact (Problem 3.74) that an echelon form of the augmented matrix 
M = [A,B] also automatically yields an echelon form of A. 
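Theorem 3.9 suggests a simple test that compares the two ranks. The sketch below computes the rank as the number of pivots found by forward elimination, in line with the definition given earlier; the names `rank` and `describe` are assumptions made for illustration, and the coefficient matrix is assumed to be a list of lists.

```python
from fractions import Fraction

def rank(A):
    """Number of pivots in an echelon form of A."""
    M = [[Fraction(x) for x in row] for row in A]
    r = 0
    for j in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][j] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            m = -M[i][j] / M[r][j]
            M[i] = [m * a + b for a, b in zip(M[r], M[i])]
        r += 1
        if r == len(M):
            break
    return r

def describe(A, B):
    """Apply Theorem 3.9 to the system with coefficients A, constants B."""
    n = len(A[0])                              # number of unknowns
    M = [row + [b] for row, b in zip(A, B)]    # augmented matrix [A, B]
    rA, rM = rank(A), rank(M)
    if rA != rM:
        return "no solution"
    return "unique solution" if rA == n else "more than one solution"

# Systems (b) and (c) of Example 3.12.
result_b = describe([[1, 1, -2, 3], [2, 3, 3, -1], [5, 7, 4, 1]], [4, 3, 5])
result_c = describe([[1, 2, 1], [2, 5, -1], [3, -2, -1]], [3, -4, 5])
```

For system (b) the coefficient matrix has rank 2 while the augmented matrix has rank 3, so part (a) of the theorem rules out a solution.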

3.9 Matrix Equation of a System of Linear Equations 


The general system (3.2) of m linear equations in n unknowns is equivalent to the matrix equation 

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} \qquad\text{or}\qquad AX = B$$

where $A = [a_{ij}]$ is the coefficient matrix, $X = [x_j]$ is the column vector of unknowns, and $B = [b_i]$ is the 
column vector of constants. (Some texts write $Ax = b$ rather than $AX = B$, in order to emphasize that x 
and b are simply column vectors.) 

The statement that the system of linear equations and the matrix equation are equivalent means that 
any vector solution of the system is a solution of the matrix equation, and vice versa. 


EXAMPLE 3.13 The following system of linear equations and matrix equation are equivalent: 


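$$\begin{aligned} x_1 + 2x_2 - 4x_3 + 7x_4 &= 4 \\ 3x_1 - 5x_2 + 6x_3 - 8x_4 &= 8 \\ 4x_1 - 3x_2 - 2x_3 + 6x_4 &= 11 \end{aligned} \qquad\text{and}\qquad \begin{bmatrix} 1 & 2 & -4 & 7 \\ 3 & -5 & 6 & -8 \\ 4 & -3 & -2 & 6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 4 \\ 8 \\ 11 \end{bmatrix}$$

We note that $x_1 = 3$, $x_2 = 1$, $x_3 = 2$, $x_4 = 1$, or, in other words, the vector $u = [3, 1, 2, 1]$ is a solution of 
the system. Thus, the (column) vector u is also a solution of the matrix equation. 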


The matrix form AX = B of a system of linear equations is notationally very convenient when 
discussing and proving properties of systems of linear equations. This is illustrated with our first theorem 
(described in Fig. 3-1), which we restate for easy reference. 


THEOREM 3.1: Suppose the field K is infinite. Then the system AX = B has: (a) a unique solution, (b) 
no solution, or (c) an infinite number of solutions. 


Proof. It suffices to show that if AX = B has more than one solution, then it has infinitely many. 
Suppose u and v are distinct solutions of AX = B; that is, Au = B and Av = B. Then, for any $k \in K$, 
$$A[u + k(u - v)] = Au + k(Au - Av) = B + k(B - B) = B$$
Thus, for each $k \in K$, the vector u + k(u - v) is a solution of AX = B. Because all such solutions are 
distinct (Problem 3.47), AX = B has an infinite number of solutions. 


Observe that the above theorem is true when K is the real field R (or the complex field C). Section 3.3 
shows that the theorem has a geometrical description when the system consists of two equations in two 
unknowns, where each equation represents a line in $R^2$. The theorem also has a geometrical description 
when the system consists of three nondegenerate equations in three unknowns, where the three equations 
correspond to planes $H_1$, $H_2$, $H_3$ in $R^3$. That is, 

(a) Unique solution: Here the three planes intersect in exactly one point. 

(b) No solution: Here the planes may intersect pairwise but with no common point of intersection, or two 
of the planes may be parallel. 

(c) Infinite number of solutions: Here the three planes may intersect in a line (one free variable), or they 
may coincide (two free variables). 


These three cases are pictured in Fig. 3-3. 


Matrix Equation of a Square System of Linear Equations 


A system AX = B of linear equations is square if and only if the matrix A of coefficients is square. In such 
a case, we have the following important result. 
[Figure 3-3: The planes $H_1$, $H_2$, and $H_3$ in $R^3$. Panels: (a) Unique solution; (b) No solutions; (c) Infinite number of solutions.] 


THEOREM 3.10: A square system AX = B of linear equations has a unique solution if and only if the 
matrix A is invertible. In such a case, $A^{-1}B$ is the unique solution of the system. 


We only prove here that if A is invertible, then $A^{-1}B$ is the unique solution. If A is invertible, then 
$$A(A^{-1}B) = (AA^{-1})B = IB = B$$
and hence, $A^{-1}B$ is a solution. Now suppose v is any solution, so Av = B. Then 
$$v = Iv = (A^{-1}A)v = A^{-1}(Av) = A^{-1}B$$
Thus, the solution $A^{-1}B$ is unique. 


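EXAMPLE 3.14 Consider the following system of linear equations, whose coefficient matrix A and 
inverse $A^{-1}$ are also given: 


$$
\begin{aligned}
x + 2y + 3z &= 1 \\
x + 3y + 6z &= 3 \\
2x + 6y + 13z &= 5
\end{aligned}
\qquad
A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 6 \\ 2 & 6 & 13 \end{bmatrix},
\qquad
A^{-1} = \begin{bmatrix} 3 & -8 & 3 \\ -1 & 7 & -3 \\ 0 & -2 & 1 \end{bmatrix}
$$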


By Theorem 3.10, the unique solution of the system is 


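$$
A^{-1}B = \begin{bmatrix} 3 & -8 & 3 \\ -1 & 7 & -3 \\ 0 & -2 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}
= \begin{bmatrix} -6 \\ 5 \\ -1 \end{bmatrix}
$$


That is, x = -6, y = 5, z = -1. 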


Remark: We emphasize that Theorem 3.10 does not usually help us to find the solution of a square 
system. That is, finding the inverse of a coefficient matrix A is not usually any easier than solving the 
system directly. Thus, unless we are given the inverse of a coefficient matrix A, as in Example 3.14, 
we usually solve a square system by Gaussian elimination (or some iterative method whose discussion 
lies beyond the scope of this text). 
3.10 Systems of Linear Equations and Linear Combinations of Vectors 


The general system (3.2) of linear equations may be rewritten as the following vector equation: 


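$$
x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}
+ x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}
+ \cdots
+ x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}
= \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
$$

Recall that a vector v in $K^n$ is said to be a linear combination of vectors $u_1, u_2, \ldots, u_m$ in $K^n$ if there exist 
scalars $a_1, a_2, \ldots, a_m$ in K such that 


$$v = a_1u_1 + a_2u_2 + \cdots + a_mu_m$$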


Accordingly, the general system (3.2) of linear equations and the above equivalent vector equation have a 
solution if and only if the column vector of constants is a linear combination of the columns of the 
coefficient matrix. We state this observation formally. 


THEOREM 3.11: A system AX = B of linear equations has a solution if and only if B is a linear 
combination of the columns of the coefficient matrix A. 


Thus, the answer to the problem of expressing a given vector v in $K^n$ as a linear combination of vectors 
$u_1, u_2, \ldots, u_m$ in $K^n$ reduces to solving a system of linear equations. 


Linear Combination Example 
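Suppose we want to write the vector v = (1, -2, 5) as a linear combination of the vectors 
$$u_1 = (1, 1, 1), \quad u_2 = (1, 2, 3), \quad u_3 = (2, -1, 1)$$


First we write $v = xu_1 + yu_2 + zu_3$ with unknowns x, y, z, and then we find the equivalent system of linear 
equations which we solve. Specifically, we first write 


$$
\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix}
= x \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
+ y \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
+ z \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}
\qquad (*)
$$
Then 
$$
\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix}
= \begin{bmatrix} x \\ x \\ x \end{bmatrix}
+ \begin{bmatrix} y \\ 2y \\ 3y \end{bmatrix}
+ \begin{bmatrix} 2z \\ -z \\ z \end{bmatrix}
= \begin{bmatrix} x + y + 2z \\ x + 2y - z \\ x + 3y + z \end{bmatrix}
$$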


Setting corresponding entries equal to each other yields the following equivalent system: 


x+ y+2z= 1 
x+2y- z=-2 (**) 
x+3y+ z= 5 


For notational convenience, we have written the vectors in $R^n$ as columns, because it is then easier to find 
the equivalent system of linear equations. In fact, one can easily go from the vector equation (*) directly 
to the system (**). 

Now we solve the equivalent system of linear equations by reducing the system to echelon form. This 
yields 


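$$
\begin{aligned}
x + y + 2z &= 1 \\
y - 3z &= -3 \\
2y - z &= 4
\end{aligned}
\qquad \text{and then} \qquad
\begin{aligned}
x + y + 2z &= 1 \\
y - 3z &= -3 \\
5z &= 10
\end{aligned}
$$


Back-substitution yields the solution x = -6, y = 3, z = 2. Thus, $v = -6u_1 + 3u_2 + 2u_3$. 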
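
The computation above can be replayed mechanically. The following is a minimal sketch, assuming a square system with a unique solution; `solve_square` is a hypothetical helper written for this illustration, and `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

def solve_square(A, B):
    """Solve AX = B by Gaussian elimination followed by back-substitution.

    Assumes A is square and the solution is unique; raises ValueError otherwise.
    """
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(A, B)]
    for c in range(n):
        p = next((i for i in range(c, n) if M[i][c] != 0), None)
        if p is None:
            raise ValueError("no unique solution")
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    x = [Fraction(0)] * n
    for i in reversed(range(n)):          # back-substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

u1, u2, u3 = (1, 1, 1), (1, 2, 3), (2, -1, 1)
v = (1, -2, 5)
# zip(u1, u2, u3) yields the rows of the matrix whose columns are u1, u2, u3.
A = [list(row) for row in zip(u1, u2, u3)]
coeffs = solve_square(A, v)   # the scalars x, y, z with v = x*u1 + y*u2 + z*u3
```

Here `coeffs` holds the same scalars x, y, z found by hand, so the vector equation (*) and the system (**) are two views of one computation.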
EXAMPLE 3.15 
(a) Write the vector v = (4, 9, 19) as a linear combination of 
$$u_1 = (1, -2, 3), \quad u_2 = (3, -7, 10), \quad u_3 = (2, 1, 9)$$


Find the equivalent system of linear equations by writing $v = xu_1 + yu_2 + zu_3$, and reduce the system to an 
echelon form. We have 


$$
\begin{aligned}
x + 3y + 2z &= 4 \\
-2x - 7y + z &= 9 \\
3x + 10y + 9z &= 19
\end{aligned}
\quad \text{or} \quad
\begin{aligned}
x + 3y + 2z &= 4 \\
-y + 5z &= 17 \\
y + 3z &= 7
\end{aligned}
\quad \text{or} \quad
\begin{aligned}
x + 3y + 2z &= 4 \\
-y + 5z &= 17 \\
8z &= 24
\end{aligned}
$$

Back-substitution yields the solution x = 4, y = -2, z = 3. Thus, v is a linear combination of $u_1, u_2, u_3$. 
Specifically, $v = 4u_1 - 2u_2 + 3u_3$. 
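(b) Write the vector v = (2, 3, -5) as a linear combination of 
$$u_1 = (1, 2, -3), \quad u_2 = (2, 3, -4), \quad u_3 = (1, 3, -5)$$


Find the equivalent system of linear equations by writing $v = xu_1 + yu_2 + zu_3$, and reduce the system to an 
echelon form. We have 


$$
\begin{aligned}
x + 2y + z &= 2 \\
2x + 3y + 3z &= 3 \\
-3x - 4y - 5z &= -5
\end{aligned}
\quad \text{or} \quad
\begin{aligned}
x + 2y + z &= 2 \\
-y + z &= -1 \\
2y - 2z &= 1
\end{aligned}
\quad \text{or} \quad
\begin{aligned}
x + 2y + z &= 2 \\
-y + z &= -1 \\
0 &= -1
\end{aligned}
$$


The last equation is degenerate with a nonzero constant, so the system has no solution. Thus, it is impossible to write v as a linear combination of $u_1, u_2, u_3$. 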


Linear Combinations of Orthogonal Vectors, Fourier Coefficients 


Recall first (Section 1.4) that the dot (inner) product $u \cdot v$ of vectors $u = (a_1, \ldots, a_n)$ and $v = (b_1, \ldots, b_n)$ 
in $R^n$ is defined by 


$$u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n$$


Furthermore, vectors u and v are said to be orthogonal if their dot product $u \cdot v = 0$. 
Suppose that $u_1, u_2, \ldots, u_n$ in $R^n$ are n nonzero pairwise orthogonal vectors. This means 


$$\text{(i)} \;\; u_i \cdot u_j = 0 \;\text{ for } i \neq j \qquad \text{and} \qquad \text{(ii)} \;\; u_i \cdot u_i \neq 0 \;\text{ for each } i$$


Then, for any vector v in $R^n$, there is an easy way to write v as a linear combination of $u_1, u_2, \ldots, u_n$, 
which is illustrated in the next example. 


EXAMPLE 3.16 Consider the following three vectors in $R^3$: 
$$u_1 = (1, 1, 1), \quad u_2 = (1, -3, 2), \quad u_3 = (5, -1, -4)$$
These vectors are pairwise orthogonal; that is, 
$$u_1 \cdot u_2 = 1 - 3 + 2 = 0, \quad u_1 \cdot u_3 = 5 - 1 - 4 = 0, \quad u_2 \cdot u_3 = 5 + 3 - 8 = 0$$


Suppose we want to write v = (4, 14, -9) as a linear combination of $u_1, u_2, u_3$. 


Method 1. Find the equivalent system of linear equations as in Example 3.15 and then solve, 
obtaining $v = 3u_1 - 4u_2 + u_3$. 


Method 2. (This method uses the fact that the vectors $u_1, u_2, u_3$ are mutually orthogonal, and 
hence, the arithmetic is much simpler.) Write v as a linear combination of $u_1, u_2, u_3$ using unknown scalars 
x, y, z as follows: 


$$(4, 14, -9) = x(1, 1, 1) + y(1, -3, 2) + z(5, -1, -4) \qquad (*)$$
Take the dot product of (*) with respect to $u_1$ to get 
$$(4, 14, -9) \cdot (1, 1, 1) = x(1, 1, 1) \cdot (1, 1, 1) \quad \text{or} \quad 9 = 3x \quad \text{or} \quad x = 3$$


(The last two terms drop out, because $u_1$ is orthogonal to $u_2$ and to $u_3$.) Next take the dot product of (*) with respect 
to $u_2$ to obtain 


$$(4, 14, -9) \cdot (1, -3, 2) = y(1, -3, 2) \cdot (1, -3, 2) \quad \text{or} \quad -56 = 14y \quad \text{or} \quad y = -4$$
Finally, take the dot product of (*) with respect to $u_3$ to get 
$$(4, 14, -9) \cdot (5, -1, -4) = z(5, -1, -4) \cdot (5, -1, -4) \quad \text{or} \quad 42 = 42z \quad \text{or} \quad z = 1$$


Thus, $v = 3u_1 - 4u_2 + u_3$. 


The procedure in Method 2 in Example 3.16 is valid in general. Namely, 


THEOREM 3.12: Suppose $u_1, u_2, \ldots, u_n$ are nonzero mutually orthogonal vectors in $R^n$. Then, for any 
vector v in $R^n$, 
$$v = \frac{v \cdot u_1}{u_1 \cdot u_1}\, u_1 + \frac{v \cdot u_2}{u_2 \cdot u_2}\, u_2 + \cdots + \frac{v \cdot u_n}{u_n \cdot u_n}\, u_n$$


We emphasize that there must be n such orthogonal vectors $u_i$ in $R^n$ for the formula to be used. Note 
also that each $u_i \cdot u_i \neq 0$, because each $u_i$ is a nonzero vector. 


Remark: The following scalar $k_i$ (appearing in Theorem 3.12) is called the Fourier coefficient of v 
with respect to $u_i$: 

$$k_i = \frac{v \cdot u_i}{u_i \cdot u_i}$$

It is analogous to a coefficient in the celebrated Fourier series of a function. 
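
As a small illustration of Theorem 3.12, the Fourier coefficients for the data of Example 3.16 can be computed with dot products alone. The function names `dot` and `fourier_coefficients` are hypothetical, and integer entries are assumed so that `Fraction` can represent each quotient exactly.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def fourier_coefficients(v, us):
    """Return k_i = (v . u_i) / (u_i . u_i) for each u_i.

    Assumes the u_i are nonzero, pairwise orthogonal, and have integer entries.
    """
    return [Fraction(dot(v, u), dot(u, u)) for u in us]

us = [(1, 1, 1), (1, -3, 2), (5, -1, -4)]
v = (4, 14, -9)
ks = fourier_coefficients(v, us)   # the quotients 9/3, -56/14, 42/42

# Rebuild v from the coefficients as a consistency check.
rebuilt = tuple(sum(k * u[i] for k, u in zip(ks, us)) for i in range(3))
```

No system of equations is solved here; each coefficient costs two dot products, which is the arithmetic saving that Method 2 exploits.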


3.11 Homogeneous Systems of Linear Equations 


A system of linear equations is said to be homogeneous if all the constant terms are zero. Thus, a 
homogeneous system has the form AX = 0. Clearly, such a system always has the zero vector 
0 = (0,0,...,0) as a solution, called the zero or trivial solution. Accordingly, we are usually interested 
in whether or not the system has a nonzero solution. 

Because a homogeneous system AX = 0 has at least the zero solution, it can always be put in an 
echelon form, say 


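$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 + \cdots + a_{1n}x_n &= 0 \\
a_{2j_2}x_{j_2} + a_{2,j_2+1}x_{j_2+1} + \cdots + a_{2n}x_n &= 0 \\
\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\
a_{rj_r}x_{j_r} + a_{r,j_r+1}x_{j_r+1} + \cdots + a_{rn}x_n &= 0
\end{aligned}
$$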


Here r denotes the number of equations in echelon form and n denotes the number of unknowns. Thus, 
the echelon system has n — r free variables. 
The question of nonzero solutions reduces to the following two cases: 
(i) r =n. The system has only the zero solution. 
(ii) r < n. The system has a nonzero solution. 


Accordingly, if we begin with fewer equations than unknowns, then, in echelon form, r < n, and the 
system has a nonzero solution. This proves the following important result. 


THEOREM 3.13: A homogeneous system AX = 0 with more unknowns than equations has a nonzero 
solution. 
EXAMPLE 3.17 Determine whether or not each of the following homogeneous systems has a nonzero 
solution: 


$$
\text{(a)} \;
\begin{aligned}
x + y - z &= 0 \\
2x - 3y + z &= 0 \\
x - 4y + 2z &= 0
\end{aligned}
\qquad
\text{(b)} \;
\begin{aligned}
x + y - z &= 0 \\
2x + 4y - z &= 0 \\
3x + 2y + 2z &= 0
\end{aligned}
\qquad
\text{(c)} \;
\begin{aligned}
x_1 + 2x_2 - 3x_3 + 4x_4 &= 0 \\
2x_1 - 3x_2 + 5x_3 - 7x_4 &= 0 \\
5x_1 + 6x_2 - 9x_3 + 8x_4 &= 0
\end{aligned}
$$


(a) Reduce the system to echelon form as follows: 


$$
\begin{aligned}
x + y - z &= 0 \\
-5y + 3z &= 0 \\
-5y + 3z &= 0
\end{aligned}
\qquad \text{and then} \qquad
\begin{aligned}
x + y - z &= 0 \\
-5y + 3z &= 0
\end{aligned}
$$

The system has a nonzero solution, because there are only two equations in the three unknowns in echelon form. 
Here z is a free variable. Let us, say, set z = 5. Then, by back-substitution, y = 3 and x = 2. Thus, the vector 
u = (2, 3, 5) is a particular nonzero solution. 

(b) Reduce the system to echelon form as follows: 


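$$
\begin{aligned}
x + y - z &= 0 \\
2y + z &= 0 \\
-y + 5z &= 0
\end{aligned}
\qquad \text{and then} \qquad
\begin{aligned}
x + y - z &= 0 \\
2y + z &= 0 \\
11z &= 0
\end{aligned}
$$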


In echelon form, there are three equations in three unknowns. Thus, the system has only the zero solution. 


(c) The system must have a nonzero solution (Theorem 3.13), because there are four unknowns but only three 
equations. (Here we do not need to reduce the system to echelon form.) 


Basis for the General Solution of a Homogeneous System 


Let W denote the general solution of a homogeneous system AX = 0. A list of nonzero solution vectors 
$u_1, u_2, \ldots, u_s$ of the system is said to be a basis for W if each solution vector $w \in W$ can be expressed 
uniquely as a linear combination of the vectors $u_1, u_2, \ldots, u_s$; that is, there exist unique scalars 
$a_1, a_2, \ldots, a_s$ such that 


$$w = a_1u_1 + a_2u_2 + \cdots + a_su_s$$


The number s of such basis vectors is equal to the number of free variables. This number s is called the 
dimension of W, written as dim W = s. When W = {0} (that is, when the system has only the zero solution), 
we define dim W = 0. 

The following theorem, proved in Chapter 5, page 171, tells us how to find such a basis. 


THEOREM 3.14: Let W be the general solution of a homogeneous system AX = 0, and suppose that 
the echelon form of the homogeneous system has s free variables. Let $u_1, u_2, \ldots, u_s$ 
be the solutions obtained by setting one of the free variables equal to 1 (or any 
nonzero constant) and the remaining free variables equal to 0. Then dim W = s, and 
the vectors $u_1, u_2, \ldots, u_s$ form a basis of W. 


We emphasize that the general solution W may have many bases, and that Theorem 3.14 only gives us 
one such basis. 


EXAMPLE 3.18 Find the dimension and a basis for the general solution W of the homogeneous system 


$$
\begin{aligned}
x_1 + 2x_2 - 3x_3 + 2x_4 - 4x_5 &= 0 \\
2x_1 + 4x_2 - 5x_3 + x_4 - 6x_5 &= 0 \\
5x_1 + 10x_2 - 13x_3 + 4x_4 - 16x_5 &= 0
\end{aligned}
$$
First reduce the system to echelon form. Apply the following operations: 


"Replace $L_2$ by $-2L_1 + L_2$" and "Replace $L_3$ by $-5L_1 + L_3$," and then "Replace $L_3$ by $-2L_2 + L_3$" 


These operations yield 


$$
\begin{aligned}
x_1 + 2x_2 - 3x_3 + 2x_4 - 4x_5 &= 0 \\
x_3 - 3x_4 + 2x_5 &= 0 \\
2x_3 - 6x_4 + 4x_5 &= 0
\end{aligned}
\qquad \text{and} \qquad
\begin{aligned}
x_1 + 2x_2 - 3x_3 + 2x_4 - 4x_5 &= 0 \\
x_3 - 3x_4 + 2x_5 &= 0
\end{aligned}
$$


The system in echelon form has three free variables, $x_2, x_4, x_5$; hence, dim W = 3. Three solution vectors that form a 
basis for W are obtained as follows: 


(1) Set $x_2 = 1$, $x_4 = 0$, $x_5 = 0$. Back-substitution yields the solution $u_1 = (-2, 1, 0, 0, 0)$. 

(2) Set $x_2 = 0$, $x_4 = 1$, $x_5 = 0$. Back-substitution yields the solution $u_2 = (7, 0, 3, 1, 0)$. 

(3) Set $x_2 = 0$, $x_4 = 0$, $x_5 = 1$. Back-substitution yields the solution $u_3 = (-2, 0, -2, 0, 1)$. 

The vectors $u_1 = (-2, 1, 0, 0, 0)$, $u_2 = (7, 0, 3, 1, 0)$, $u_3 = (-2, 0, -2, 0, 1)$ form a basis for W. 

Remark: Any solution of the system in Example 3.18 can be written in the form 


$$
\begin{aligned}
au_1 + bu_2 + cu_3 &= a(-2, 1, 0, 0, 0) + b(7, 0, 3, 1, 0) + c(-2, 0, -2, 0, 1) \\
&= (-2a + 7b - 2c, \; a, \; 3b - 2c, \; b, \; c)
\end{aligned}
$$


or 
$$x_1 = -2a + 7b - 2c, \quad x_2 = a, \quad x_3 = 3b - 2c, \quad x_4 = b, \quad x_5 = c$$


where a, b, c are arbitrary constants. Observe that this representation is nothing more than the parametric 
form of the general solution under the choice of parameters $x_2 = a$, $x_4 = b$, $x_5 = c$. 
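
The recipe of Theorem 3.14 can be sketched as follows for the system of Example 3.18. This version reduces all the way to row canonical form, a slight variation on the echelon form used above, so that each basis vector can be read off without back-substitution. The names `rref` and `nullspace_basis` are illustrative, not from the text.

```python
from fractions import Fraction

def rref(rows):
    """Return (row canonical form, list of pivot column indices)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]            # scale the pivot to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:           # clear the rest of column c
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def nullspace_basis(A):
    """One basis vector per free variable: that variable is 1, the other free ones 0."""
    n = len(A[0])
    R, pivots = rref(A)
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        u = [Fraction(0)] * n
        u[f] = Fraction(1)
        for row, p in zip(R, pivots):   # the pivot variable in column p equals -row[f]
            u[p] = -row[f]
        basis.append(u)
    return basis

A = [[1, 2, -3, 2, -4], [2, 4, -5, 1, -6], [5, 10, -13, 4, -16]]
basis = nullspace_basis(A)
```

The free columns found are those of $x_2, x_4, x_5$ (indices 1, 3, 4 when counting from zero), and `basis` reproduces the three vectors obtained by hand.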


Nonhomogeneous and Associated Homogeneous Systems 


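Let AX = B be a nonhomogeneous system of linear equations. Then AX = 0 is called the associated 
homogeneous system. For example, 


$$
\begin{aligned}
x + 2y - 4z &= 7 \\
3x - 5y + 6z &= 8
\end{aligned}
\qquad \text{and} \qquad
\begin{aligned}
x + 2y - 4z &= 0 \\
3x - 5y + 6z &= 0
\end{aligned}
$$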


show a nonhomogeneous system and its associated homogeneous system. 
The relationship between the solution U of a nonhomogeneous system AX = B and the solution W of 
its associated homogeneous system AX = 0 is contained in the following theorem. 


THEOREM 3.15: Let $v_0$ be a particular solution of AX = B and let W be the general solution of 
AX = 0. Then the following is the general solution of AX = B: 


$$U = v_0 + W = \{v_0 + w : w \in W\}$$


That is, $U = v_0 + W$ is obtained by adding $v_0$ to each element in W. We note that this theorem has a 
geometrical interpretation in $R^3$. Specifically, suppose W is a line through the origin O. Then, as pictured 
in Fig. 3-4, $U = v_0 + W$ is the line parallel to W obtained by adding $v_0$ to each element of W. Similarly, 
whenever W is a plane through the origin O, then $U = v_0 + W$ is a plane parallel to W. 
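
Theorem 3.15 can be checked numerically on the two-equation system displayed above. In this sketch the particular solution `v0` and the homogeneous solution `w` were found by hand for the purpose of illustration; they do not appear in the text.

```python
def apply(A, x):
    """Return the matrix-vector product Ax as a list."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 2, -4], [3, -5, 6]]
B = [7, 8]
v0 = [9, 11, 6]    # assumed particular solution of AX = B (checked in code below)
w = [8, 18, 11]    # assumed nonzero solution of the associated system AX = 0

def shifted(k):
    """Return v0 + k*w, a typical element of v0 + W."""
    return [p + k * q for p, q in zip(v0, w)]

# Every point of the shifted line solves the nonhomogeneous system.
all_solve = all(apply(A, shifted(k)) == B for k in range(-3, 4))
```

Because the echelon form of this system has one free variable, W is the line through the origin spanned by `w`, and `shifted(k)` traces the parallel line through `v0`.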
3.12 Elementary Matrices 


Let e denote an elementary row operation and let e(A) denote the results of applying the operation e to a 
matrix A. Now let E be the matrix obtained by applying e to the identity matrix I; that is, 


E = e(I) 


Then E is called the elementary matrix corresponding to the elementary row operation e. Note that E is 
always a square matrix. 


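EXAMPLE 3.19 Consider the following three elementary row operations: 


(1) Interchange $R_2$ and $R_3$. (2) Replace $R_2$ by $-6R_2$. (3) Replace $R_3$ by $-4R_1 + R_3$. 


The 3 x 3 elementary matrices corresponding to the above elementary row operations are as follows: 


$$
E_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad
E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -6 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix}
$$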


The following theorem, proved in Problem 3.34, holds. 


THEOREM 3.16: Let e be an elementary row operation and let E be the corresponding m x m 
elementary matrix. Then 


e(A) = EA 
where A is any m x n matrix. 


In other words, the result of applying an elementary row operation e to a matrix A can be obtained by 
premultiplying A by the corresponding elementary matrix E. 
Now suppose e' is the inverse of an elementary row operation e, and let E' and E be the corresponding 
matrices. We note (Problem 3.33) that E is invertible and E' is its inverse. This means, in particular, that 
any product 


$$P = E_k \cdots E_2E_1$$


of elementary matrices is invertible. 
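
Theorem 3.16 is easy to check on a small case. The sketch below builds the elementary matrix for the operation "Replace $R_3$ by $-4R_1 + R_3$" by applying it to the identity, then compares EA with e(A). The helpers `matmul`, `identity`, and `add_multiple` are hypothetical names, and rows are indexed from 0 as is usual in Python.

```python
def matmul(P, Q):
    """Product of matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def add_multiple(M, k, src, dst):
    """Row operation e: replace row dst by k*(row src) + (row dst)."""
    out = [row[:] for row in M]
    out[dst] = [k * a + b for a, b in zip(M[src], M[dst])]
    return out

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]   # an arbitrary 3 x 4 matrix
E3 = add_multiple(identity(3), -4, 0, 2)            # E = e(I)
same = matmul(E3, A) == add_multiple(A, -4, 0, 2)   # compares EA with e(A)
```

The matrix `E3` built this way has the entry -4 in its third row and first column, and `same` evaluates to `True` for this choice of A.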
Applications of Elementary Matrices 


Using Theorem 3.16, we are able to prove (Problem 3.35) the following important properties of matrices. 


THEOREM 3.17: Let A be a square matrix. Then the following are equivalent: 
(a) A is invertible (nonsingular). 
(b) A is row equivalent to the identity matrix I. 
(c) A is a product of elementary matrices. 


Recall that square matrices A and B are inverses if AB = BA = I. The next theorem (proved in 
Problem 3.36) demonstrates that we need only show that one of the products equals I, say AB = I, to prove 
that the matrices are inverses. 


THEOREM 3.18: Suppose AB = I. Then BA = I, and hence, $B = A^{-1}$. 

Row equivalence can also be defined in terms of matrix multiplication. Specifically, we will prove 
(Problem 3.37) the following. 


THEOREM 3.19: B is row equivalent to A if and only if there exists a nonsingular matrix P such that 
B= PA. 


Application to Finding the Inverse of an n x n Matrix 
The following algorithm finds the inverse of a matrix. 


ALGORITHM 3.5: The input is a square matrix A. The output is the inverse of A or that the inverse 
does not exist. 


Step 1. Form the n x 2n (block) matrix M = [A, I], where A is the left half of M and the identity matrix 
I is the right half of M. 


Step 2. Row reduce M to echelon form. If the process generates a zero row in the A half of M, then 
STOP: A has no inverse. (Otherwise A is in triangular form.) 

Step 3. Further row reduce M to its row canonical form 
$$M \sim [I, B]$$
where the identity matrix I has replaced A in the left half of M. 

Step 4. Set $A^{-1} = B$, the matrix that is now in the right half of M. 


The justification for the above algorithm is as follows. Suppose A is invertible and, say, the sequence 
of elementary row operations $e_1, e_2, \ldots, e_q$ applied to M = [A, I] reduces the left half of M, which is A, to 
the identity matrix I. Let $E_i$ be the elementary matrix corresponding to the operation $e_i$. Then, by 
applying Theorem 3.16, we get 


$$E_q \cdots E_2E_1A = I \quad \text{or} \quad (E_q \cdots E_2E_1I)A = I, \quad \text{so} \quad A^{-1} = E_q \cdots E_2E_1I$$
That is, $A^{-1}$ can be obtained by applying the elementary row operations $e_1, e_2, \ldots, e_q$ to the identity 
matrix I, which appears in the right half of M. Thus, $B = A^{-1}$, as claimed. 
EXAMPLE 3.20 


Find the inverse of the matrix $A = \begin{bmatrix} 1 & 0 & 2 \\ 2 & -1 & 3 \\ 4 & 1 & 8 \end{bmatrix}$. 
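First form the (block) matrix M = [A, I] and row reduce M to an echelon form: 


$$
M = \left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 2 & -1 & 3 & 0 & 1 & 0 \\ 4 & 1 & 8 & 0 & 0 & 1 \end{array}\right]
\sim \left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & -1 & -1 & -2 & 1 & 0 \\ 0 & 1 & 0 & -4 & 0 & 1 \end{array}\right]
\sim \left[\begin{array}{ccc|ccc} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & -1 & -1 & -2 & 1 & 0 \\ 0 & 0 & -1 & -6 & 1 & 1 \end{array}\right]
$$


In echelon form, the left half of M is in triangular form; hence, A has an inverse. Next we further row reduce M to its 
row canonical form: 


$$
M \sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -11 & 2 & 2 \\ 0 & -1 & 0 & 4 & 0 & -1 \\ 0 & 0 & 1 & 6 & -1 & -1 \end{array}\right]
\sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -11 & 2 & 2 \\ 0 & 1 & 0 & -4 & 0 & 1 \\ 0 & 0 & 1 & 6 & -1 & -1 \end{array}\right]
$$


The identity matrix is now in the left half of the final matrix; hence, the right half is $A^{-1}$. In other words, 


$$A^{-1} = \begin{bmatrix} -11 & 2 & 2 \\ -4 & 0 & 1 \\ 6 & -1 & -1 \end{bmatrix}$$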
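
Algorithm 3.5 translates into a short routine. The sketch below is one possible rendering rather than a transcription of the algorithm: it merges Steps 2 and 3 into a single pass (clearing each pivot column above and below the pivot as it goes), and the function name `inverse` is an assumption of this illustration.

```python
from fractions import Fraction

def inverse(A):
    """Return the inverse of the square matrix A, or None if A has no inverse."""
    n = len(A)
    # Step 1: form the block matrix M = [A, I].
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next((i for i in range(c, n) if M[i][c] != 0), None)
        if p is None:
            return None                       # no pivot in this column: singular
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [x / piv for x in M[c]]        # make the pivot equal to 1
        for i in range(n):
            if i != c:                        # clear column c in every other row
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    # Step 4: the right half of M is now the inverse.
    return [row[n:] for row in M]

A = [[1, 0, 2], [2, -1, 3], [4, 1, 8]]
A_inv = inverse(A)
```

For the matrix of Example 3.20, `A_inv` agrees with the inverse found by hand, and a matrix such as `[[1, 2], [2, 4]]` yields `None`.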


Elementary Column Operations 


Now let A be a matrix with columns $C_1, C_2, \ldots, C_n$. The following operations on A, analogous to the 
elementary row operations, are called elementary column operations: 
[$F_1$] (Column Interchange): Interchange columns $C_i$ and $C_j$. 
[$F_2$] (Column Scaling): Replace $C_i$ by $kC_i$ (where $k \neq 0$). 
[$F_3$] (Column Addition): Replace $C_j$ by $kC_i + C_j$. 
We may indicate each of the column operations by writing, respectively, 


$$(1) \; C_i \leftrightarrow C_j, \qquad (2) \; kC_i \to C_i, \qquad (3) \; (kC_i + C_j) \to C_j$$


Moreover, each column operation has an inverse operation of the same type, just like the corresponding 
row operation. 

Now let f denote an elementary column operation, and let F be the matrix obtained by applying f to 
the identity matrix I; that is, 


F = f(I) 


Then F is called the elementary matrix corresponding to the elementary column operation f. Note that F 
is always a square matrix. 


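EXAMPLE 3.21 
Consider the following elementary column operations: 


(1) Interchange $C_1$ and $C_3$; (2) Replace $C_3$ by $-2C_3$; (3) Replace $C_3$ by $-3C_2 + C_3$ 
The corresponding three 3 x 3 elementary matrices are as follows: 
$$
F_1 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \qquad
F_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{bmatrix}, \qquad
F_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -3 \\ 0 & 0 & 1 \end{bmatrix}
$$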


The following theorem is analogous to Theorem 3.16 for the elementary row operations. 


THEOREM 3.20: For any matrix A, f(A) = AF. 


That is, the result of applying an elementary column operation f on a matrix A can be obtained by 
postmultiplying A by the corresponding elementary matrix F. 
Matrix Equivalence 


A matrix B is equivalent to a matrix A if B can be obtained from A by a sequence of row and column 
operations. Alternatively, B is equivalent to A, if there exist nonsingular matrices P and Q such that 
B = PAQ. Just like row equivalence, equivalence of matrices is an equivalence relation. 


The main result of this subsection (proved in Problem 3.38) is as follows. 

THEOREM 3.21: Every m x n matrix A is equivalent to a unique block matrix of the form 
$$\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$$


where $I_r$ is the r-square identity matrix. 


The following definition applies. 


DEFINITION: The nonnegative integer r in Theorem 3.21 is called the rank of A, written rank(A). 
Note that this definition agrees with the previous definition of the rank of a matrix. 


3.13 LU DECOMPOSITION 


Suppose A is a nonsingular matrix that can be brought into (upper) triangular form U using only row- 
addition operations; that is, suppose A can be triangularized by the following algorithm, which we write 
using computer notation. 


ALGORITHM 3.6: The input is a matrix A and the output is a triangular matrix U. 
Step 1. Repeat for i = 1, 2, ..., n - 1: 
Step 2. Repeat for j = i + 1, i + 2, ..., n: 


(a) Set $m_{ji} := -a_{ji}/a_{ii}$. 


(b) Set $R_j := m_{ji}R_i + R_j$. 
[End of Step 2 inner loop.] 


[End of Step 1 outer loop.] 


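The numbers $m_{ij}$ are called multipliers. Sometimes we keep track of these multipliers by means of the 
following lower triangular matrix L: 


$$
L = \begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 \\
-m_{21} & 1 & 0 & \cdots & 0 & 0 \\
-m_{31} & -m_{32} & 1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
-m_{n1} & -m_{n2} & -m_{n3} & \cdots & -m_{n,n-1} & 1
\end{bmatrix}
$$


That is, L has 1's on the diagonal, 0's above the diagonal, and the negative of the multiplier $m_{ij}$ as its 
ij-entry below the diagonal. 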

The above matrix L and the triangular matrix U obtained in Algorithm 3.6 give us the classical LU 
factorization of such a matrix A. Namely, 


THEOREM 3.22: Let A be a nonsingular matrix that can be brought into triangular form U using only 
row-addition operations. Then A = LU, where L is the above lower triangular matrix 
with 1’s on the diagonal, and U is an upper triangular matrix with no 0’s on the 
diagonal. 
EXAMPLE 3.22 Suppose $A = \begin{bmatrix} 1 & 2 & -3 \\ -3 & -4 & 13 \\ 2 & 1 & -5 \end{bmatrix}$. We note that A may be reduced to triangular form by the operations 
"Replace $R_2$ by $3R_1 + R_2$"; "Replace $R_3$ by $-2R_1 + R_3$"; and then "Replace $R_3$ by $\tfrac{3}{2}R_2 + R_3$" 
That is, 
$$
A \sim \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & -3 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & 0 & 7 \end{bmatrix}
$$


This gives us the classical factorization A = LU, where 


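$$
L = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 2 & -\tfrac{3}{2} & 1 \end{bmatrix}
\qquad \text{and} \qquad
U = \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & 0 & 7 \end{bmatrix}
$$


We emphasize: 
(1) The entries $-3$, $2$, $-\tfrac{3}{2}$ in L are the negatives of the multipliers in the above elementary row operations. 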


(2) U is the triangular form of A. 
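
Algorithm 3.6, together with the bookkeeping for L, can be sketched as follows. The function name `lu` is an assumption of this illustration, indices run from 0 as is usual in Python, and no row interchanges are attempted, in keeping with the hypothesis of Theorem 3.22.

```python
from fractions import Fraction

def lu(A):
    """Return (L, U) with A = LU, using only row-addition operations.

    Raises ValueError if a zero pivot appears, since no rows are interchanged.
    """
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(n - 1):
        if U[i][i] == 0:
            raise ValueError("zero pivot: row-addition operations do not suffice")
        for j in range(i + 1, n):
            m = -U[j][i] / U[i][i]                       # multiplier for row j
            U[j] = [m * a + b for a, b in zip(U[i], U[j])]
            L[j][i] = -m                                 # L keeps the negative of the multiplier
    return L, U

A = [[1, 2, -3], [-3, -4, 13], [2, 1, -5]]
L, U = lu(A)
```

Applied to the matrix of Example 3.22, the multipliers computed are 3, -2, and 3/2, so `L` and `U` match the factors displayed above.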


Application to Systems of Linear Equations 


Consider a computer algorithm M. Let C(n) denote the running time of the algorithm as a function of the 
size n of the input data. [The function C(n) is sometimes called the time complexity or simply the 
complexity of the algorithm M.] Frequently, C(n) simply counts the number of multiplications and 
divisions executed by M, but does not count the number of additions and subtractions because they take 
much less time to execute. 
Now consider a square system of linear equations AX = B, where 
$$A = [a_{ij}], \qquad X = [x_1, \ldots, x_n]^T, \qquad B = [b_1, \ldots, b_n]^T$$
and suppose A has an LU factorization. Then the system can be brought into triangular form (in order to 
apply back-substitution) by applying Algorithm 3.6 to the augmented matrix M = [A, B] of the system. 
The time complexities of Algorithm 3.6 and back-substitution are, respectively, 


$$C(n) \approx \tfrac{1}{2}n^3 \qquad \text{and} \qquad C(n) \approx \tfrac{1}{2}n^2$$


where n is the number of equations. 

On the other hand, suppose we already have the factorization A = LU. Then, to triangularize the 
system, we need only apply the row operations in the algorithm (retained by the matrix L) to the column 
vector B. In this case, the time complexity is 


$$C(n) \approx \tfrac{1}{2}n^2$$


Of course, to obtain the factorization A = LU requires the original algorithm where $C(n) \approx \tfrac{1}{2}n^3$. Thus, 
nothing may be gained by first finding the LU factorization when a single system is involved. However, 
there are situations, illustrated below, where the LU factorization is useful. 

Suppose, for a given matrix A, we need to solve the system 


AX =B 
repeatedly for a sequence of different constant vectors, say $B_1, B_2, \ldots, B_k$. Also, suppose some of the $B_i$ 
depend upon the solution of the system obtained while using preceding vectors $B_j$. In such a case, it is 
more efficient to first find the LU factorization of A, and then to use this factorization to solve the system 
for each new B. 


EXAMPLE 3.23 Consider the following system of linear equations: 


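$$
\begin{aligned}
x + 2y + z &= k_1 \\
2x + 3y + 3z &= k_2 \\
-3x + 10y + 2z &= k_3
\end{aligned}
\qquad \text{or} \qquad AX = B, \quad \text{where} \quad
A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 3 \\ -3 & 10 & 2 \end{bmatrix}
\quad \text{and} \quad
B = \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix}
$$


Suppose we want to solve the system three times where B is equal, say, to $B_1, B_2, B_3$. Furthermore, suppose 
$B_1 = [1, 1, 1]^T$, and suppose 
$$B_{j+1} = B_j + X_j \quad (\text{for } j = 1, 2)$$


where $X_j$ is the solution of $AX = B_j$. Here it is more efficient to first obtain the LU factorization of A and then use the 
LU factorization to solve the system for each of the B's. (This is done in Problem 3.42.) 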
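
The scheme of Example 3.23 can be sketched as follows. The factors `L` and `U` below were worked out by hand for this illustration (the text defers the computation to Problem 3.42), and `forward_sub` and `back_sub` are hypothetical helper names. Each new right-hand side then costs only two triangular solves.

```python
from fractions import Fraction

def forward_sub(L, b):
    """Solve LY = b for a lower triangular L with 1's on the diagonal."""
    y = []
    for i, row in enumerate(L):
        y.append(Fraction(b[i]) - sum(row[j] * y[j] for j in range(i)))
    return y

def back_sub(U, y):
    """Solve UX = y for an upper triangular U with nonzero diagonal."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x

A = [[1, 2, 1], [2, 3, 3], [-3, 10, 2]]
# Assumed factors of A, computed by hand for this sketch; L times U equals A.
L = [[1, 0, 0], [2, 1, 0], [-3, -16, 1]]
U = [[1, 2, 1], [0, -1, 1], [0, 0, 21]]

B = [Fraction(1)] * 3                     # B_1 = [1, 1, 1]^T
constants, solutions = [], []
for _ in range(3):
    X = back_sub(U, forward_sub(L, B))    # AX = B becomes LY = B, then UX = Y
    constants.append(B)
    solutions.append(X)
    B = [b + x for b, x in zip(B, X)]     # B_{j+1} = B_j + X_j
```

The factorization is computed once, while the loop body performs only the two substitution passes for each $B_j$.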


SOLVED PROBLEMS 


Linear Equations, Solutions, 2 x 2 Systems 

3.1. Determine whether each of the following equations is linear: 
(a) $5x + 7y - 8yz = 16$, (b) $x + \pi y + ez = \log 5$, (c) $3x + ky - 8z = 16$ 
(a) No, because the product yz of two unknowns is of second degree. 


(b) Yes, because $\pi$, e, and log 5 are constants. 


(c) As it stands, there are four unknowns: x, y, z, k. Because of the term ky it is not a linear equation. However, 
assuming k is a constant, the equation is linear in the unknowns x, y, z. 


3.2. Determine whether the following vectors are solutions of $x_1 + 2x_2 - 4x_3 + 3x_4 = 15$: 
(a) u = (3, 2, 1, 4) and (b) v = (1, 2, 4, 5). 
(a) Substitute to obtain 3 + 2(2) - 4(1) + 3(4) = 15, or 15 = 15; yes, it is a solution. 
(b) Substitute to obtain 1 + 2(2) - 4(4) + 3(5) = 15, or 4 = 15; no, it is not a solution. 


3.3. Solve (a) $ex = \pi$, (b) $3x - 4 - x = 2x + 3$, (c) $7 + 2x - 4 = 3x + 3 - x$ 
(a) Because $e \neq 0$, multiply by 1/e to obtain $x = \pi/e$. 
(b) Rewrite in standard form, obtaining 0x = 7. The equation has no solution. 


(c) Rewrite in standard form, obtaining 0x = 0. Every scalar k is a solution. 


3.4. Prove Theorem 3.4: Consider the equation ax = b. 


(i) If $a \neq 0$, then x = b/a is a unique solution of ax = b. 
(ii) If a = 0 but $b \neq 0$, then ax = b has no solution. 
(iii) If a = 0 and b = 0, then every scalar k is a solution of ax = b. 


Suppose $a \neq 0$. Then the scalar b/a exists. Substituting b/a in ax = b yields a(b/a) = b, or b = b; 
hence, b/a is a solution. On the other hand, suppose $x_0$ is a solution to ax = b, so that $ax_0 = b$. Multiplying 
both sides by 1/a yields $x_0 = b/a$. Hence, b/a is the unique solution of ax = b. Thus, (i) is proved. 

On the other hand, suppose a = 0. Then, for any scalar k, we have ak = 0k = 0. If $b \neq 0$, then $ak \neq b$. 
Accordingly, k is not a solution of ax = b, and so (ii) is proved. If b = 0, then ak = b. That is, any scalar k is 
a solution of ax = b, and so (iii) is proved. 
3.5. Solve each of the following systems: 
$$
\text{(a)} \;
\begin{aligned}
2x - 5y &= 11 \\
3x + 4y &= 5
\end{aligned}
\qquad
\text{(b)} \;
\begin{aligned}
2x - 3y &= 8 \\
-6x + 9y &= 6
\end{aligned}
\qquad
\text{(c)} \;
\begin{aligned}
2x - 3y &= 8 \\
-4x + 6y &= -16
\end{aligned}
$$
(a) Eliminate x from the equations by forming the new equation $L = -3L_1 + 2L_2$. This yields the equation 
$$23y = -23, \quad \text{and so} \quad y = -1$$
Substitute y = -1 in one of the original equations, say $L_1$, to get 
$$2x - 5(-1) = 11 \quad \text{or} \quad 2x + 5 = 11 \quad \text{or} \quad 2x = 6 \quad \text{or} \quad x = 3$$

Thus, x = 3, y = -1 or the pair u = (3, -1) is the unique solution of the system. 

(b) Eliminate x from the equations by forming the new equation $L = 3L_1 + L_2$. This yields the equation 
$$0x + 0y = 30$$


This is a degenerate equation with a nonzero constant; hence, this equation and the system have no 
solution. (Geometrically, the lines corresponding to the equations are parallel.) 


(c) Eliminate x from the equations by forming the new equation $L = 2L_1 + L_2$. This yields the equation 
$$0x + 0y = 0$$


This is a degenerate equation where the constant term is also zero. Thus, the system has an infinite 
number of solutions, which correspond to the solution of either equation. (Geometrically, the lines 
corresponding to the equations coincide.) 


To find the general solution, set y = a and substitute in $L_1$ to obtain 
$$2x - 3a = 8 \quad \text{or} \quad 2x = 3a + 8 \quad \text{or} \quad x = \tfrac{3}{2}a + 4$$
Thus, the general solution is 
$$x = \tfrac{3}{2}a + 4, \quad y = a \qquad \text{or} \qquad u = \left(\tfrac{3}{2}a + 4, \; a\right)$$


where a is any scalar. 


3.6. Consider the system 
$$
\begin{aligned}
x + ay &= 4 \\
ax + 9y &= b
\end{aligned}
$$
(a) For which values of a does the system have a unique solution? 
(b) Find those pairs of values (a,b) for which the system has more than one solution. 


(a) Eliminate x from the equations by forming the new equation $L = -aL_1 + L_2$. This yields the equation 
$$(9 - a^2)y = b - 4a \qquad (1)$$
The system has a unique solution if and only if the coefficient of y in (1) is not zero; that is, if 
$9 - a^2 \neq 0$ or if $a \neq \pm 3$. 
(b) The system has more than one solution if both sides of (1) are zero. The left-hand side is zero when 
$a = \pm 3$. When a = 3, the right-hand side is zero when b - 12 = 0 or b = 12. When a = -3, the 
right-hand side is zero when b + 12 = 0 or b = -12. Thus, (3, 12) and (-3, -12) are the pairs for which the 
system has more than one solution. 


Systems in Triangular and Echelon Form 


3.7. 


Determine the pivot and free variables in each of the following systems: 


2x1 — 3x, — 6x3 — 5x4 + 2x; = 7 2x—6y+7z=1 x+2y=3z=2 
x3 + 3x4 — 7x5 = 6 4y+3z=8 2x+3y+ z=4 
X4—2x5=1 2z=4 3x+4y+5z = 8 

(a) (b) (c) 


(a) In echelon form, the leading unknowns are the pivot variables, and the others are the free variables. Here 
X1, X3, X4 are the pivot variables, and x, and x; are the free variables. 
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3.8. 


3.9. 


3.10. 


(b) The leading unknowns are x,y,z, so they are the pivot variables. There are no free variables (as in any 
triangular system). 


(c) The notion of pivot and free variables applies only to a system in echelon form. 


Solve the triangular system in Problem 3.7(b). 
Because it is a triangular system, solve by back-substitution. 
(i) The last equation gives z = 2. 
(ii) Substitute z = 2 in the second equation to get 4y + 6 = 8 or y=}. 
(iii) Substitute z = 2 and y= 5 in the first equation to get 
1 
2x—6(5) +70) =1 or 2x+11=1 o x=-5 
Thus, x= -—5, y= Ss z=2 oru= (—5,4,2) is the unique solution to the system. 
3.9. Solve the echelon system in Problem 3.7(a).
Assign parameters to the free variables, say x2 = a and x5 = b, and solve for the pivot variables by
back-substitution.
(i) Substitute x5 = b in the last equation to get x4 - 2b = 1 or x4 = 2b + 1.
(ii) Substitute x5 = b and x4 = 2b + 1 in the second equation to get
x3 + 3(2b + 1) - 7b = 6 or x3 - b + 3 = 6 or x3 = b + 3
(iii) Substitute x5 = b, x4 = 2b + 1, x3 = b + 3, x2 = a in the first equation to get
2x1 - 3a - 6(b + 3) - 5(2b + 1) + 2b = 7 or 2x1 - 3a - 14b - 23 = 7
or x1 = (3/2)a + 7b + 15
Thus,


x1 = (3/2)a + 7b + 15, x2 = a, x3 = b + 3, x4 = 2b + 1, x5 = b


or u = ((3/2)a + 7b + 15, a, b + 3, 2b + 1, b)


is the parametric form of the general solution.
Alternatively, solving for the pivot variables x1, x3, x4 in terms of the free variables x2 and x5 yields the
following free-variable form of the general solution:


x1 = (3/2)x2 + 7x5 + 15, x3 = x5 + 3, x4 = 2x5 + 1


3.10. Prove Theorem 3.6. Consider the system (3.4) of linear equations in echelon form with r equations
and n unknowns.


(i) If r = n, then the system has a unique solution.


(ii) If r < n, then we can arbitrarily assign values to the n - r free variables and solve uniquely for
the r pivot variables, obtaining a solution of the system.


(i) Suppose r = n. Then we have a square system AX = B where the matrix A of coefficients is (upper)
triangular with nonzero diagonal elements. Thus, A is invertible. By Theorem 3.10, the system has a
unique solution.


(ii) Assigning values to the n - r free variables yields a triangular system in the pivot variables, which, by
(i), has a unique solution.
Gaussian Elimination


3.11. Solve each of the following systems:


 x + 2y - 4z =  -4          x + 2y - 3z = -1         x + 2y -  3z = 1
2x + 5y - 9z = -10        -3x +  y - 2z = -7        2x + 5y -  8z = 4
3x - 2y + 3z =  11         5x + 3y - 4z =  2        3x + 8y - 13z = 7

      (a)                       (b)                       (c)


Reduce each system to triangular or echelon form using Gaussian elimination:


(a) Apply “Replace L2 by -2L1 + L2” and “Replace L3 by -3L1 + L3” to eliminate x from the second and
third equations, and then apply “Replace L3 by 8L2 + L3” to eliminate y from the third equation. These
operations yield


x + 2y -  4z = -4                   x + 2y - 4z = -4
      y -   z = -2      and then          y -  z = -2
   -8y + 15z = 23                             7z =  7


The system is in triangular form. Solve by back-substitution to obtain the unique solution
u = (2, -1, 1).


(b) Eliminate x from the second and third equations by the operations “Replace L2 by 3L1 + L2” and
“Replace L3 by -5L1 + L3.” This gives the equivalent system


x + 2y -  3z =  -1
     7y - 11z = -10
    -7y + 11z =   7


The operation “Replace L3 by L2 + L3” yields the following degenerate equation with a nonzero
constant:
0x + 0y + 0z = -3
This equation and hence the system have no solution.
(c) Eliminate x from the second and third equations by the operations “Replace L2 by -2L1 + L2” and
“Replace L3 by -3L1 + L3.” This yields the new system


x + 2y - 3z = 1                x + 2y - 3z = 1
      y - 2z = 2       or            y - 2z = 2
     2y - 4z = 4


(The third equation is deleted, because it is a multiple of the second equation.) The system is in echelon
form with pivot variables x and y and free variable z.

To find the parametric form of the general solution, set z = a and solve for x and y by
back-substitution. Substitute z = a in the second equation to get y = 2 + 2a. Then substitute z = a and
y = 2 + 2a in the first equation to get


x + 2(2 + 2a) - 3a = 1 or x + 4 + a = 1 or x = -3 - a
Thus, the general solution is
x = -3 - a, y = 2 + 2a, z = a or u = (-3 - a, 2 + 2a, a)


where a is a parameter.
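The three outcomes seen in Problem 3.11 (unique solution, no solution, infinitely many solutions) can be read off an echelon form of the augmented matrix. The sketch below is an illustration only; the function names echelon and classify are hypothetical, and it assumes the system is given as an augmented matrix [A | b] with integer or rational entries.

```python
from fractions import Fraction

def echelon(M):
    """Return an echelon form of M (a list of rows) using exact arithmetic."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        # find a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        for i in range(r + 1, rows):
            k = A[i][c] / A[r][c]
            A[i] = [a - k * b for a, b in zip(A[i], A[r])]
        r += 1
    return A

def classify(M):
    """Classify the linear system whose augmented matrix is M = [A | b]."""
    E = [row for row in echelon(M) if any(row)]     # drop zero rows
    n = len(M[0]) - 1                                # number of unknowns
    if any(not any(row[:-1]) for row in E):
        return "no solution"                         # a row 0 = c with c != 0
    return "unique solution" if len(E) == n else "infinitely many solutions"

print(classify([[1, 2, -4, -4], [2, 5, -9, -10], [3, -2, 3, 11]]))   # system (a)
print(classify([[1, 2, -3, -1], [-3, 1, -2, -7], [5, 3, -4, 2]]))    # system (b)
print(classify([[1, 2, -3, 1], [2, 5, -8, 4], [3, 8, -13, 7]]))      # system (c)
```

A degenerate row with a nonzero constant signals inconsistency; otherwise the count of nonzero rows r is compared with the number n of unknowns, as in Theorem 3.6.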


3.12. Solve each of the following systems:


 x1 - 3x2 + 2x3 -  x4 + 2x5 = 2           x1 +  2x2 - 3x3 + 4x4 = 2
3x1 - 9x2 + 7x3 -  x4 + 3x5 = 7          2x1 +  5x2 - 2x3 +  x4 = 1
2x1 - 6x2 + 7x3 + 4x4 - 5x5 = 7          5x1 + 12x2 - 7x3 + 6x4 = 3

            (a)                                    (b)


Reduce each system to echelon form using Gaussian elimination:
(a) Apply “Replace L2 by -3L1 + L2” and “Replace L3 by -2L1 + L3” to eliminate x1 from the second and
third equations. This yields


x1 - 3x2 + 2x3 -  x4 + 2x5 = 2               x1 - 3x2 + 2x3 -  x4 + 2x5 = 2
            x3 + 2x4 - 3x5 = 1       or                  x3 + 2x4 - 3x5 = 1
           3x3 + 6x4 - 9x5 = 3


(We delete L3, because it is a multiple of L2.) The system is in echelon form with pivot variables x1 and
x3 and free variables x2, x4, x5.

To find the parametric form of the general solution, set x2 = a, x4 = b, x5 = c, where a, b, c are
parameters. Back-substitution yields x3 = 1 - 2b + 3c and x1 = 3a + 5b - 8c. The general solution is


x1 = 3a + 5b - 8c, x2 = a, x3 = 1 - 2b + 3c, x4 = b, x5 = c


or, equivalently, u = (3a + 5b - 8c, a, 1 - 2b + 3c, b, c).


(b) Eliminate x1 from the second and third equations by the operations “Replace L2 by -2L1 + L2” and
“Replace L3 by -5L1 + L3.” This yields the system


x1 + 2x2 - 3x3 +  4x4 =  2
       x2 + 4x3 -  7x4 = -3
      2x2 + 8x3 - 14x4 = -7


The operation “Replace L3 by -2L2 + L3” yields the degenerate equation 0 = -1. Thus, the system
has no solution (even though the system has more unknowns than equations).


3.13. Solve using the condensed format:


      2y + 3z =  3
 x +   y +  z =  4
4x + 8y - 3z = 35


Number     Equation                Operation
(2) (1)          2y + 3z =  3      L1 ↔ L2
(1) (2)     x +   y +  z =  4      L1 ↔ L2
(3)        4x + 8y - 3z = 35
(3′)             4y - 7z = 19      Replace L3 by -4L1 + L3
(3″)               -13z = 13      Replace L3 by -2L2 + L3


Here (1), (2), and (3″) form a triangular system. (We emphasize that the interchange of L1 and L2 is
accomplished by simply renumbering L1 and L2 as above.)


Using back-substitution with the triangular system yields z = -1 from L3, y = 3 from L2, and x = 2
from L1. Thus, the unique solution of the system is x = 2, y = 3, z = -1 or the triple u = (2, 3, -1).


3.14. Consider the system 


x+2y+ z= 3 
ay + 5z = 10 
2x+7y+az= b 


(a) Find those values of a for which the system has a unique solution. 


(b) Find those pairs of values (a, b) for which the system has more than one solution.


Reduce the system to echelon form. That is, eliminate x from the third equation by the operation
“Replace L3 by -2L1 + L3” and then eliminate y from the third equation by the operation
“Replace L3 by -3L2 + aL3.” This yields


x + 2y +        z = 3                         x + 2y + z = 3
     ay +       5z = 10        and then           ay + 5z = 10
     3y + (a - 2)z = b - 6             (a² - 2a - 15)z = ab - 6a - 30


Examine the last equation (a² - 2a - 15)z = ab - 6a - 30.
(a) The system has a unique solution if and only if the coefficient of z is not zero; that is, if


a² - 2a - 15 = (a - 5)(a + 3) ≠ 0 or a ≠ 5 and a ≠ -3.


(b) The system has more than one solution if both sides are zero. The left-hand side is zero when a = 5 or
a = -3. When a = 5, the right-hand side is zero when 5b - 60 = 0, or b = 12. When a = -3, the right-hand
side is zero when -3b - 12 = 0, or b = -4. Thus, (5, 12) and (-3, -4) are the pairs for which the
system has more than one solution.


Echelon Matrices, Row Equivalence, Row Canonical Form 


3.15. Row reduce each of the following matrices to echelon form:


        [ 1  2 -3  0 ]              [ -4  1 -6 ]
(a) A = [ 2  4 -2  2 ],     (b) B = [  1  2 -5 ]
        [ 3  6 -4  3 ]              [  6  3 -4 ]


(a) Use a11 = 1 as a pivot to obtain 0’s below a11; that is, apply the row operations “Replace R2 by
-2R1 + R2” and “Replace R3 by -3R1 + R3.” Then use a23 = 4 as a pivot to obtain a 0 below a23; that
is, apply the row operation “Replace R3 by -5R2 + 4R3.” These operations yield


    [ 1  2 -3  0 ]   [ 1  2 -3  0 ]
A ~ [ 0  0  4  2 ] ~ [ 0  0  4  2 ]
    [ 0  0  5  3 ]   [ 0  0  0  2 ]


The matrix is now in echelon form.


(b) Hand calculations are usually simpler if the pivot element equals 1. Therefore, first interchange R1 and R2.
Next apply the operations “Replace R2 by 4R1 + R2” and “Replace R3 by -6R1 + R3”; and then apply
the operation “Replace R3 by R2 + R3.” These operations yield


    [  1  2 -5 ]   [ 1  2  -5 ]   [ 1  2  -5 ]
B ~ [ -4  1 -6 ] ~ [ 0  9 -26 ] ~ [ 0  9 -26 ]
    [  6  3 -4 ]   [ 0 -9  26 ]   [ 0  0   0 ]


The matrix is now in echelon form.


3.16. Describe the pivoting row-reduction algorithm. Also describe the advantages, if any, of using this 
pivoting algorithm. 


The row-reduction algorithm becomes a pivoting algorithm if the entry in column j of greatest absolute
value is chosen as the pivot a_{1j_1}, and if one uses the row operation

(-a_{ij_1} / a_{1j_1})R1 + R_i → R_i


The main advantage of the pivoting algorithm is that the above row operation involves division by the
(current) pivot a_{1j_1}, and, on the computer, roundoff errors may be substantially reduced when one divides by
a number as large in absolute value as possible.


3.17. Reduce the following matrix A to echelon form using the pivoting algorithm:


    [  2 -2  2  1 ]
A = [ -3  6  0 -1 ]
    [  1 -7 10  2 ]


First interchange R1 and R2 so that -3 can be used as the pivot, and then apply the operations “Replace R2
by (2/3)R1 + R2” and “Replace R3 by (1/3)R1 + R3.” These operations yield


    [ -3  6  0 -1 ]   [ -3  6  0  -1  ]
A ~ [  2 -2  2  1 ] ~ [  0  2  2  1/3 ]
    [  1 -7 10  2 ]   [  0 -5 10  5/3 ]


Now interchange R2 and R3 so that -5 can be used as the pivot, and then apply the operation “Replace R3 by
(2/5)R2 + R3.” We obtain


    [ -3  6  0  -1  ]   [ -3  6  0  -1  ]
A ~ [  0 -5 10  5/3 ] ~ [  0 -5 10  5/3 ]
    [  0  2  2  1/3 ]   [  0  0  6   1  ]


The matrix has been brought to echelon form using partial pivoting.
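The pivoting algorithm of Problems 3.16 and 3.17 can be sketched as follows. This is an illustrative sketch rather than a definitive implementation: the function name echelon_partial_pivoting is hypothetical, and exact fractions are used only so that the output can be compared with the hand calculation. The benefit of pivoting described in Problem 3.16 concerns floating-point arithmetic, where roundoff matters.

```python
from fractions import Fraction

def echelon_partial_pivoting(M):
    """Row reduce M to echelon form, choosing in each column the entry of
    greatest absolute value (at or below the current row) as the pivot."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        p = max(range(r, rows), key=lambda i: abs(A[i][c]))
        if A[p][c] == 0:
            continue                         # no pivot in this column
        A[r], A[p] = A[p], A[r]              # bring the pivot row into place
        for i in range(r + 1, rows):
            k = -A[i][c] / A[r][c]           # multiplier -(entry)/(pivot)
            A[i] = [k * pr + ai for pr, ai in zip(A[r], A[i])]
        r += 1
    return A

# Matrix of Problem 3.17
A = [[2, -2, 2, 1],
     [-3, 6, 0, -1],
     [1, -7, 10, 2]]
for row in echelon_partial_pivoting(A):
    print([str(x) for x in row])
```

Each multiplier has the pivot in its denominator, so choosing the largest available pivot keeps the multipliers at most 1 in absolute value.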


3.18. Reduce each of the following matrices to row canonical form:


        [ 2  2 -1  6  4 ]              [ 5 -9  6 ]
(a) A = [ 4  4  1 10 13 ],     (b) B = [ 0  2  3 ]
        [ 8  8 -1 26 23 ]              [ 0  0  7 ]


(a) First reduce A to echelon form by applying the operations “Replace R2 by -2R1 + R2” and “Replace R3
by -4R1 + R3,” and then applying the operation “Replace R3 by -R2 + R3.” These operations yield


    [ 2  2 -1  6  4 ]   [ 2  2 -1  6  4 ]
A ~ [ 0  0  3 -2  5 ] ~ [ 0  0  3 -2  5 ]
    [ 0  0  3  2  7 ]   [ 0  0  0  4  2 ]


Now use back-substitution on the echelon matrix to obtain the row canonical form of A. Specifically,
first multiply R3 by 1/4 to obtain the pivot a34 = 1, and then apply the operations “Replace R2 by
2R3 + R2” and “Replace R1 by -6R3 + R1.” These operations yield


    [ 2  2 -1  6   4  ]   [ 2  2 -1  0   1  ]
A ~ [ 0  0  3 -2   5  ] ~ [ 0  0  3  0   6  ]
    [ 0  0  0  1  1/2 ]   [ 0  0  0  1  1/2 ]


Now multiply R2 by 1/3, making the pivot a23 = 1, and then apply “Replace R1 by R2 + R1,” yielding


    [ 2  2  0  0   3  ]
A ~ [ 0  0  1  0   2  ]
    [ 0  0  0  1  1/2 ]


Finally, multiply R1 by 1/2, so the pivot a11 = 1. Thus, we obtain the following row canonical form of A:


    [ 1  1  0  0  3/2 ]
A ~ [ 0  0  1  0   2  ]
    [ 0  0  0  1  1/2 ]


(b) Because B is in echelon form, use back-substitution to obtain

    [ 5 -9  6 ]   [ 5 -9  0 ]   [ 5 -9  0 ]   [ 5  0  0 ]   [ 1  0  0 ]
B ~ [ 0  2  3 ] ~ [ 0  2  0 ] ~ [ 0  1  0 ] ~ [ 0  1  0 ] ~ [ 0  1  0 ]
    [ 0  0  1 ]   [ 0  0  1 ]   [ 0  0  1 ]   [ 0  0  1 ]   [ 0  0  1 ]


The last matrix, which is the identity matrix I, is the row canonical form of B. (This is expected, because
B is invertible, and so its row canonical form must be I.)


3.19. Describe the Gauss-Jordan elimination algorithm, which also row reduces an arbitrary matrix A to
its row canonical form.


The Gauss-Jordan algorithm is similar in some ways to the Gaussian elimination algorithm, except that
here each pivot is used to place 0’s both below and above the pivot, not just below the pivot, before working
with the next pivot. Also, one variation of the algorithm first normalizes each row (that is, obtains a unit
pivot) before it is used to produce 0’s in the other rows, rather than normalizing the rows at the end of the
algorithm.


3.20. Use Gauss-Jordan to find the row canonical form of the matrix


    [ 1 -2  3  1  2 ]
A = [ 1  1  4 -1  3 ]
    [ 2  5  9 -2  8 ]


Use a11 = 1 as a pivot to obtain 0’s below a11 by applying the operations “Replace R2 by -R1 + R2”
and “Replace R3 by -2R1 + R3.” This yields


    [ 1 -2  3  1  2 ]
A ~ [ 0  3  1 -2  1 ]
    [ 0  9  3 -4  4 ]


Multiply R2 by 1/3 to make the pivot a22 = 1, and then produce 0’s below and above a22 by applying the
operations “Replace R3 by -9R2 + R3” and “Replace R1 by 2R2 + R1.” These operations yield


    [ 1 -2   3     1    2  ]   [ 1  0  11/3  -1/3  8/3 ]
A ~ [ 0  1  1/3  -2/3  1/3 ] ~ [ 0  1   1/3  -2/3  1/3 ]
    [ 0  9   3    -4    4  ]   [ 0  0    0     2    1  ]


Finally, multiply R3 by 1/2 to make the pivot a34 = 1, and then produce 0’s above a34 by applying the
operations “Replace R2 by (2/3)R3 + R2” and “Replace R1 by (1/3)R3 + R1.” These operations yield


    [ 1  0  11/3  -1/3  8/3 ]   [ 1  0  11/3  0  17/6 ]
A ~ [ 0  1   1/3  -2/3  1/3 ] ~ [ 0  1   1/3  0   2/3 ]
    [ 0  0    0     1   1/2 ]   [ 0  0    0   1   1/2 ]


which is the row canonical form of A.
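The Gauss-Jordan variation described in Problem 3.19 (normalize each pivot row first, then clear the pivot column above and below) can be sketched as follows. The function name row_canonical_form is a hypothetical choice, and the code is a minimal illustration using exact fractions.

```python
from fractions import Fraction

def row_canonical_form(M):
    """Gauss-Jordan: normalize each pivot to 1, then clear its whole column."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0
    for c in range(cols):
        if r == rows:
            break
        p = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if p is None:
            continue                              # no pivot in this column
        A[r], A[p] = A[p], A[r]
        pivot = A[r][c]
        A[r] = [x / pivot for x in A[r]]          # unit pivot
        for i in range(rows):                     # zeros above and below
            if i != r and A[i][c] != 0:
                k = A[i][c]
                A[i] = [a - k * b for a, b in zip(A[i], A[r])]
        r += 1
    return A

# Matrix of Problem 3.20
A = [[1, -2, 3, 1, 2],
     [1, 1, 4, -1, 3],
     [2, 5, 9, -2, 8]]
for row in row_canonical_form(A):
    print([str(x) for x in row])
```

Because every pivot column is cleared as soon as its pivot is found, no separate back-substitution pass is needed.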


Systems of Linear Equations in Matrix Form 
3.21. Find the augmented matrix M and the coefficient matrix A of the following system: 


x+2y—3z=4 
3y —4z+ 7x =5 
6z+ 8x —9y = 1 


First align the unknowns in the system, and then use the aligned system to obtain M and A. We have


 x + 2y - 3z = 4                  [ 1  2 -3  4 ]               [ 1  2 -3 ]
7x + 3y - 4z = 5;    then    M = [ 7  3 -4  5 ]    and    A = [ 7  3 -4 ]
8x - 9y + 6z = 1                  [ 8 -9  6  1 ]               [ 8 -9  6 ]

3.22. Solve each of the following systems using its augmented matrix M:

 x + 2y -  z =  3          x - 2y + 4z = 2          x +  y + 3z = 1
 x + 3y +  z =  5         2x - 3y + 5z = 3         2x + 3y -  z = 3
3x + 8y + 4z = 17         3x - 4y + 6z = 7         5x + 7y +  z = 7
      (a)                       (b)                      (c)

(a) Reduce the augmented matrix M to echelon form as follows:

    [ 1  2 -1   3 ]   [ 1  2 -1  3 ]   [ 1  2 -1  3 ]
M = [ 1  3  1   5 ] ~ [ 0  1  2  2 ] ~ [ 0  1  2  2 ]
    [ 3  8  4  17 ]   [ 0  2  7  8 ]   [ 0  0  3  4 ]

Now write down the corresponding triangular system


x + 2y -  z = 3
      y + 2z = 2
          3z = 4

and solve by back-substitution to obtain the unique solution

x = 17/3, y = -2/3, z = 4/3 or u = (17/3, -2/3, 4/3)

Alternately, reduce the echelon form of M to row canonical form, obtaining

    [ 1  2 -1   3  ]   [ 1  2  0  13/3 ]   [ 1  0  0  17/3 ]
M ~ [ 0  1  2   2  ] ~ [ 0  1  0  -2/3 ] ~ [ 0  1  0  -2/3 ]
    [ 0  0  1  4/3 ]   [ 0  0  1   4/3 ]   [ 0  0  1   4/3 ]

This also corresponds to the above solution.
(b) First reduce the augmented matrix M to echelon form as follows:

    [ 1 -2  4  2 ]   [ 1 -2  4   2 ]   [ 1 -2  4   2 ]
M = [ 2 -3  5  3 ] ~ [ 0  1 -3  -1 ] ~ [ 0  1 -3  -1 ]
    [ 3 -4  6  7 ]   [ 0  2 -6   1 ]   [ 0  0  0   3 ]


The third row corresponds to the degenerate equation 0x + 0y + 0z = 3, which has no solution. Thus,
“DO NOT CONTINUE.” The original system also has no solution. (Note that the echelon form
indicates whether or not the system has a solution.)


(c) Reduce the augmented matrix M to echelon form and then to row canonical form:


    [ 1  1  3  1 ]   [ 1  1   3  1 ]
M = [ 2  3 -1  3 ] ~ [ 0  1  -7  1 ]
    [ 5  7  1  7 ]   [ 0  2 -14  2 ]

  ~ [ 1  0  10  0 ]
    [ 0  1  -7  1 ]


(The third row of the second matrix is deleted, because it is a multiple of the second row and will result
in a zero row.) Write down the system corresponding to the row canonical form of M and then transfer
the free variables to the other side to obtain the free-variable form of the solution:


x + 10z = 0               x = -10z
y -  7z = 1      and      y = 1 + 7z


Here z is the only free variable. The parametric solution, using z = a, is as follows:


x = -10a, y = 1 + 7a, z = a or u = (-10a, 1 + 7a, a)


3.23. Solve the following system using its augmented matrix M:


 x1 + 2x2 - 3x3 - 2x4 + 4x5 = 1
2x1 + 5x2 - 8x3 -  x4 + 6x5 = 4
 x1 + 4x2 - 7x3 + 5x4 + 2x5 = 8


Reduce the augmented matrix M to echelon form and then to row canonical form:


    [ 1  2 -3 -2  4  1 ]   [ 1  2 -3 -2  4  1 ]   [ 1  2 -3 -2  4  1 ]
M = [ 2  5 -8 -1  6  4 ] ~ [ 0  1 -2  3 -2  2 ] ~ [ 0  1 -2  3 -2  2 ]
    [ 1  4 -7  5  2  8 ]   [ 0  2 -4  7 -2  7 ]   [ 0  0  0  1  2  3 ]

    [ 1  2 -3  0  8  7 ]   [ 1  0  1  0  24  21 ]
  ~ [ 0  1 -2  0 -8 -7 ] ~ [ 0  1 -2  0  -8  -7 ]
    [ 0  0  0  1  2  3 ]   [ 0  0  0  1   2   3 ]


Write down the system corresponding to the row canonical form of M and then transfer the free variables to
the other side to obtain the free-variable form of the solution:


x1 +  x3 + 24x5 = 21               x1 = 21 - x3 - 24x5
x2 - 2x3 -  8x5 = -7      and      x2 = -7 + 2x3 + 8x5
       x4 + 2x5 =  3               x4 = 3 - 2x5

Here x1, x2, x4 are the pivot variables and x3 and x5 are the free variables. Recall that the parametric form of
the solution can be obtained from the free-variable form of the solution by simply setting the free variables
equal to parameters, say x3 = a, x5 = b. This process yields

x1 = 21 - a - 24b, x2 = -7 + 2a + 8b, x3 = a, x4 = 3 - 2b, x5 = b
or u = (21 - a - 24b, -7 + 2a + 8b, a, 3 - 2b, b)


which is another form of the solution.
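A parametric solution of this kind can be checked by substituting it back into the original equations for several parameter values. The short sketch below does this for Problem 3.23; the function name residuals is hypothetical, and a zero residual in every equation confirms the solution for that choice of a and b.

```python
def residuals(a, b):
    """Left-hand side minus right-hand side of each equation of Problem 3.23,
    evaluated at the parametric solution with parameters a and b."""
    x1, x2, x3, x4, x5 = 21 - a - 24 * b, -7 + 2 * a + 8 * b, a, 3 - 2 * b, b
    return (
        x1 + 2 * x2 - 3 * x3 - 2 * x4 + 4 * x5 - 1,
        2 * x1 + 5 * x2 - 8 * x3 - x4 + 6 * x5 - 4,
        x1 + 4 * x2 - 7 * x3 + 5 * x4 + 2 * x5 - 8,
    )

# Try a small grid of parameter values
for a in range(-2, 3):
    for b in range(-2, 3):
        assert residuals(a, b) == (0, 0, 0)
print("parametric solution satisfies all three equations")
```

A numerical check on a grid is not a proof, but because each residual is a linear expression in a and b, vanishing at these points does force it to vanish identically.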


Linear Combinations, Homogeneous Systems 
3.24. Write v as a linear combination of u1, u2, u3, where
(a) v = (3, 10, 7) and u1 = (1, 3, -2), u2 = (1, 4, 2), u3 = (2, 8, 1);
(b) v = (2, 7, 10) and u1 = (1, 2, 3), u2 = (1, 3, 5), u3 = (1, 5, 9);
(c) v = (1, 5, 4) and u1 = (1, 3, -2), u2 = (2, 7, -1), u3 = (1, 6, 7).
Find the equivalent system of linear equations by writing v = xu1 + yu2 + zu3. Alternatively, use the
augmented matrix M of the equivalent system, where M = [u1, u2, u3, v]. (Here u1, u2, u3, v are the columns
of M.)


(a) The vector equation v = xu1 + yu2 + zu3 for the given vectors is as follows:


[  3 ]     [  1 ]     [ 1 ]     [ 2 ]   [   x +  y + 2z ]
[ 10 ] = x [  3 ] + y [ 4 ] + z [ 8 ] = [  3x + 4y + 8z ]
[  7 ]     [ -2 ]     [ 2 ]     [ 1 ]   [ -2x + 2y +  z ]


Form the equivalent system of linear equations by setting corresponding entries equal to each other, and
then reduce the system to echelon form:


  x +  y + 2z =  3            x + y + 2z =  3            x + y + 2z = 3
 3x + 4y + 8z = 10     or         y + 2z =  1     or         y + 2z = 1
-2x + 2y +  z =  7              4y + 5z = 13                   -3z = 9

The system is in triangular form. Back-substitution yields the unique solution x = 2, y = 7, z = -3.
Thus, v = 2u1 + 7u2 - 3u3.
Alternatively, form the augmented matrix M = [u1, u2, u3, v] of the equivalent system, and reduce
M to echelon form:


    [  1  1  2   3 ]   [ 1  1  2   3 ]   [ 1  1  2  3 ]
M = [  3  4  8  10 ] ~ [ 0  1  2   1 ] ~ [ 0  1  2  1 ]
    [ -2  2  1   7 ]   [ 0  4  5  13 ]   [ 0  0 -3  9 ]


The last matrix corresponds to a triangular system that has a unique solution. Back-substitution yields
the solution x = 2, y = 7, z = -3. Thus, v = 2u1 + 7u2 - 3u3.


(b) Form the augmented matrix M = [u1, u2, u3, v] of the equivalent system, and reduce M to echelon
form:

    [ 1  1  1   2 ]   [ 1  1  1  2 ]   [ 1  1  1   2 ]
M = [ 2  3  5   7 ] ~ [ 0  1  3  3 ] ~ [ 0  1  3   3 ]
    [ 3  5  9  10 ]   [ 0  2  6  4 ]   [ 0  0  0  -2 ]

The third row corresponds to the degenerate equation 0x + 0y + 0z = -2, which has no solution. Thus,
the system also has no solution, and v cannot be written as a linear combination of u1, u2, u3.


(c) Form the augmented matrix M = [u1, u2, u3, v] of the equivalent system, and reduce M to echelon form:

    [  1  2  1  1 ]   [ 1  2  1  1 ]   [ 1  2  1  1 ]
M = [  3  7  6  5 ] ~ [ 0  1  3  2 ] ~ [ 0  1  3  2 ]
    [ -2 -1  7  4 ]   [ 0  3  9  6 ]   [ 0  0  0  0 ]

The last matrix corresponds to the following system with free variable z:
x + 2y +  z = 1
      y + 3z = 2


Thus, v can be written as a linear combination of u1, u2, u3 in many ways. For example, let the free
variable z = 1, and, by back-substitution, we get y = -1 and x = 2. Thus, v = 2u1 - u2 + u3.


3.25. Let u1 = (1, 2, 4), u2 = (2, -3, 1), u3 = (2, 1, -1) in R³. Show that u1, u2, u3 are orthogonal, and
write v as a linear combination of u1, u2, u3, where (a) v = (7, 16, 6), (b) v = (3, 5, 2).
Take the dot product of pairs of vectors to get
u1·u2 = 2 - 6 + 4 = 0, u1·u3 = 2 + 2 - 4 = 0, u2·u3 = 4 - 3 - 1 = 0
Thus, the three vectors in R³ are orthogonal, and hence Fourier coefficients can be used. That is,
v = xu1 + yu2 + zu3, where


x = (v·u1)/(u1·u1), y = (v·u2)/(u2·u2), z = (v·u3)/(u3·u3)

(a) We have
x = (7 + 32 + 24)/(1 + 4 + 16) = 63/21 = 3, y = (14 - 48 + 6)/(4 + 9 + 1) = -28/14 = -2, z = (14 + 16 - 6)/(4 + 1 + 1) = 24/6 = 4
Thus, v = 3u1 - 2u2 + 4u3.
(b) We have
x = (3 + 10 + 8)/(1 + 4 + 16) = 21/21 = 1, y = (6 - 15 + 2)/(4 + 9 + 1) = -7/14 = -1/2, z = (6 + 5 - 2)/(4 + 1 + 1) = 9/6 = 3/2


Thus, v = u1 - (1/2)u2 + (3/2)u3.
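The Fourier-coefficient computation can be sketched in a few lines. The helper names dot and fourier_coefficients are hypothetical; the sketch assumes the given vectors are nonzero and mutually orthogonal, which is checked first.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fourier_coefficients(v, vectors):
    """Coefficients (v·u)/(u·u) of v relative to mutually orthogonal nonzero vectors."""
    return [Fraction(dot(v, u), dot(u, u)) for u in vectors]

u1, u2, u3 = (1, 2, 4), (2, -3, 1), (2, 1, -1)

# Orthogonality check: every pairwise dot product must be zero
assert dot(u1, u2) == 0 and dot(u1, u3) == 0 and dot(u2, u3) == 0

coeffs_a = fourier_coefficients((7, 16, 6), [u1, u2, u3])
coeffs_b = fourier_coefficients((3, 5, 2), [u1, u2, u3])
print([str(c) for c in coeffs_a])
print([str(c) for c in coeffs_b])
```

No system of equations has to be solved here: orthogonality lets each coefficient be computed independently of the others.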


3.26. Find the dimension and a basis for the general solution W of each of the following homogeneous
systems:

2x1 +  4x2 -  5x3 + 3x4 = 0           x - 2y - 3z = 0
3x1 +  6x2 -  7x3 + 4x4 = 0          2x +  y + 3z = 0
5x1 + 10x2 - 11x3 + 6x4 = 0          3x - 4y - 2z = 0
          (a)                             (b)


(a) Reduce the system to echelon form using the operations “Replace L2 by -3L1 + 2L2,” “Replace L3 by
-5L1 + 2L3,” and then “Replace L3 by -3L2 + L3.” These operations yield


2x1 + 4x2 - 5x3 + 3x4 = 0                 2x1 + 4x2 - 5x3 + 3x4 = 0
                x3 -  x4 = 0      and                     x3 -  x4 = 0
               3x3 - 3x4 = 0

The system in echelon form has two free variables, x2 and x4, so dim W = 2. A basis [u1, u2] for W may
be obtained as follows:
(1) Set x2 = 1, x4 = 0. Back-substitution yields x3 = 0, and then x1 = -2. Thus, u1 = (-2, 1, 0, 0).
(2) Set x2 = 0, x4 = 1. Back-substitution yields x3 = 1, and then x1 = 1. Thus, u2 = (1, 0, 1, 1).
(b) Reduce the system to echelon form, obtaining


x - 2y - 3z = 0                x - 2y - 3z = 0
     5y + 9z = 0      and           5y + 9z = 0
     2y + 7z = 0                        17z = 0


There are no free variables (the system is in triangular form). Hence, dim W = 0, and W has no basis.
Specifically, W consists only of the zero solution; that is, W = {0}.


3.27. Find the dimension and a basis for the general solution W of the following homogeneous system
using matrix notation:


 x1 + 2x2 +  3x3 - 2x4 +  4x5 = 0
2x1 + 4x2 +  8x3 +  x4 +  9x5 = 0
3x1 + 6x2 + 13x3 + 4x4 + 14x5 = 0


Show how the basis gives the parametric form of the general solution of the system.
When a system is homogeneous, we represent the system by its coefficient matrix A rather than by its
augmented matrix M, because the last column of the augmented matrix M is a zero column, and it will
remain a zero column during any row-reduction process.
Reduce the coefficient matrix A to echelon form, obtaining


    [ 1  2   3  -2   4 ]   [ 1  2  3  -2  4 ]
A = [ 2  4   8   1   9 ] ~ [ 0  0  2   5  1 ]
    [ 3  6  13   4  14 ]   [ 0  0  4  10  2 ]

  ~ [ 1  2  3  -2  4 ]
    [ 0  0  2   5  1 ]


(The third row of the second matrix is deleted, because it is a multiple of the second row and will result in a
zero row.) We can now proceed in one of two ways.


(a) Write down the corresponding homogeneous system in echelon form:


x1 + 2x2 + 3x3 - 2x4 + 4x5 = 0
           2x3 + 5x4 +  x5 = 0


The system in echelon form has three free variables, x2, x4, x5, so dim W = 3. A basis [u1, u2, u3] for W
may be obtained as follows:


(1) Set x2 = 1, x4 = 0, x5 = 0. Back-substitution yields x3 = 0, and then x1 = -2. Thus,
u1 = (-2, 1, 0, 0, 0).
(2) Set x2 = 0, x4 = 1, x5 = 0. Back-substitution yields x3 = -5/2, and then x1 = 19/2. Thus,
u2 = (19/2, 0, -5/2, 1, 0).
(3) Set x2 = 0, x4 = 0, x5 = 1. Back-substitution yields x3 = -1/2, and then x1 = -5/2. Thus,
u3 = (-5/2, 0, -1/2, 0, 1).


[One could avoid fractions in the basis by choosing x4 = 2 in (2) and x5 = 2 in (3), which yields
multiples of u2 and u3.] The parametric form of the general solution is obtained from the following
linear combination of the basis vectors using parameters a, b, c:


au1 + bu2 + cu3 = (-2a + (19/2)b - (5/2)c, a, -(5/2)b - (1/2)c, b, c)


(b) Reduce the echelon form of A to row canonical form:


A ~ [ 1  2  3  -2    4  ] ~ [ 1  2  0  -19/2  5/2 ]
    [ 0  0  1  5/2  1/2 ]   [ 0  0  1    5/2  1/2 ]


Write down the corresponding free-variable solution:


x1 = -2x2 + (19/2)x4 - (5/2)x5
x3 = -(5/2)x4 - (1/2)x5


Using these equations for the pivot variables x1 and x3, repeat the above process to obtain a basis [u1, u2, u3]
for W. That is, set x2 = 1, x4 = 0, x5 = 0 to get u1; set x2 = 0, x4 = 1, x5 = 0 to get u2; and set x2 = 0,
x4 = 0, x5 = 1 to get u3.
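The procedure in part (b), reading a basis of W off the row canonical form by setting one free variable to 1 and the others to 0, can be sketched as follows. The function names rref and null_space_basis are hypothetical, and the code is a minimal illustration with exact fractions.

```python
from fractions import Fraction

def rref(M):
    """Return (row canonical form without zero rows, list of pivot columns)."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivots, r = [], 0
    for c in range(cols):
        if r == rows:
            break
        p = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        A[r] = [x / A[r][c] for x in A[r]]
        for i in range(rows):
            if i != r:
                k = A[i][c]
                A[i] = [a - k * b for a, b in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
    return A[:r], pivots

def null_space_basis(M):
    """Basis of the solution space of the homogeneous system with coefficient matrix M."""
    R, pivots = rref(M)
    n = len(M[0])
    basis = []
    for f in (j for j in range(n) if j not in pivots):   # one vector per free variable
        u = [Fraction(0)] * n
        u[f] = Fraction(1)
        for row, p in zip(R, pivots):
            u[p] = -row[f]            # pivot variable from its equation
        basis.append(u)
    return basis

# Coefficient matrix of Problem 3.27
A = [[1, 2, 3, -2, 4],
     [2, 4, 8, 1, 9],
     [3, 6, 13, 4, 14]]
for u in null_space_basis(A):
    print([str(x) for x in u])
```

The number of basis vectors equals the number of free variables, which is dim W.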


3.28. Prove Theorem 3.15. Let v0 be a particular solution of AX = B, and let W be the general solution
of AX = 0. Then U = v0 + W = {v0 + w : w ∈ W} is the general solution of AX = B.
Let w be a solution of AX = 0. Then

A(v0 + w) = Av0 + Aw = B + 0 = B


Thus, the sum v0 + w is a solution of AX = B. On the other hand, suppose v is also a solution of AX = B.
Then


A(v - v0) = Av - Av0 = B - B = 0


Therefore, v - v0 belongs to W. Because v = v0 + (v - v0), we find that any solution of AX = B can be
obtained by adding a solution of AX = 0 to the particular solution v0 of AX = B. Thus, the theorem is proved.
Elementary Matrices, Applications
3.29. Let e1, e2, e3 denote, respectively, the elementary row operations
“Interchange rows R1 and R2,” “Replace R3 by 7R3,” “Replace R2 by -3R1 + R2”


Find the corresponding three-square elementary matrices E1, E2, E3. Apply each operation to the 3 × 3 identity
matrix I3 to obtain


     [ 0  1  0 ]         [ 1  0  0 ]         [  1  0  0 ]
E1 = [ 1  0  0 ],   E2 = [ 0  1  0 ],   E3 = [ -3  1  0 ]
     [ 0  0  1 ]         [ 0  0  7 ]         [  0  0  1 ]


3.30. Consider the elementary row operations in Problem 3.29.
(a) Describe the inverse operations e1⁻¹, e2⁻¹, e3⁻¹.
(b) Find the corresponding three-square elementary matrices E1′, E2′, E3′.
(c) What is the relationship between the matrices E1′, E2′, E3′ and the matrices E1, E2, E3?
(a) The inverses of e1, e2, e3 are, respectively,
“Interchange rows R1 and R2,” “Replace R3 by (1/7)R3,” “Replace R2 by 3R1 + R2.”


(b) Apply each inverse operation to the 3 × 3 identity matrix I3 to obtain


      [ 0  1  0 ]          [ 1  0   0  ]          [ 1  0  0 ]
E1′ = [ 1  0  0 ],   E2′ = [ 0  1   0  ],   E3′ = [ 3  1  0 ]
      [ 0  0  1 ]          [ 0  0  1/7 ]          [ 0  0  1 ]

(c) The matrices E1′, E2′, E3′ are, respectively, the inverses of the matrices E1, E2, E3.
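The relationship in part (c) can be checked mechanically: build each elementary matrix by applying its row operation to the identity, then multiply. The sketch below is illustrative only; the helper names (identity, matmul, interchange, scale, add_multiple) are hypothetical, and rows are indexed from 0, so R1, R2, R3 correspond to indices 0, 1, 2.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def interchange(i, j):
    """Row operation: interchange rows i and j."""
    def op(M):
        M = [row[:] for row in M]
        M[i], M[j] = M[j], M[i]
        return M
    return op

def scale(i, k):
    """Row operation: replace R_i by k R_i (k nonzero)."""
    def op(M):
        M = [row[:] for row in M]
        M[i] = [k * x for x in M[i]]
        return M
    return op

def add_multiple(i, j, k):
    """Row operation: replace R_i by k R_j + R_i."""
    def op(M):
        M = [row[:] for row in M]
        M[i] = [k * b + a for a, b in zip(M[i], M[j])]
        return M
    return op

I3 = identity(3)
ops      = [interchange(0, 1), scale(2, Fraction(7)),    add_multiple(1, 0, Fraction(-3))]
inverses = [interchange(0, 1), scale(2, Fraction(1, 7)), add_multiple(1, 0, Fraction(3))]

E  = [e(I3) for e in ops]         # E1, E2, E3
Ep = [e(I3) for e in inverses]    # E1', E2', E3'

for Ei, Epi in zip(E, Ep):
    assert matmul(Epi, Ei) == I3 and matmul(Ei, Epi) == I3
print("each E' is the inverse of the corresponding E")
```

The same helpers can be used to observe that applying a row operation to a matrix A gives the same result as multiplying A on the left by the corresponding elementary matrix.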
3.31. Write each of the following matrices as a product of elementary matrices:

(a) A = [  1 -3 ]      (b) B = [ 1  2  3 ]      (c) C = [  1  1  2 ]
        [ -2  4 ],             [ 0  1  4 ],             [  2  3  8 ]
                               [ 0  0  1 ]              [ -3 -1  2 ]


The following three steps write a matrix M as a product of elementary matrices:


Step 1. Row reduce M to the identity matrix I, keeping track of the elementary row operations.
Step 2. Write down the inverse row operations.
Step 3. Write M as the product of the elementary matrices corresponding to the inverse operations. This
gives the desired result.


If a zero row appears in Step 1, then M is not row equivalent to the identity matrix I, and M cannot be
written as a product of elementary matrices.


(a) (1) We have

A = [  1 -3 ] ~ [ 1 -3 ] ~ [ 1 -3 ] ~ [ 1  0 ] = I
    [ -2  4 ]   [ 0 -2 ]   [ 0  1 ]   [ 0  1 ]


where the row operations are, respectively,
“Replace R2 by 2R1 + R2,” “Replace R2 by -(1/2)R2,” “Replace R1 by 3R2 + R1”
(2) Inverse operations:


“Replace R2 by -2R1 + R2,” “Replace R2 by -2R2,” “Replace R1 by -3R2 + R1”


(3) A = [  1  0 ] [ 1  0 ] [ 1 -3 ]
        [ -2  1 ] [ 0 -2 ] [ 0  1 ]

(b) (1) We have


    [ 1  2  3 ]   [ 1  2  0 ]   [ 1  0  0 ]
B = [ 0  1  4 ] ~ [ 0  1  0 ] ~ [ 0  1  0 ] = I
    [ 0  0  1 ]   [ 0  0  1 ]   [ 0  0  1 ]

where the row operations are, respectively,
“Replace R2 by -4R3 + R2,” “Replace R1 by -3R3 + R1,” “Replace R1 by -2R2 + R1”
(2) Inverse operations:
“Replace R2 by 4R3 + R2,” “Replace R1 by 3R3 + R1,” “Replace R1 by 2R2 + R1”

        [ 1  0  0 ] [ 1  0  3 ] [ 1  2  0 ]
(3) B = [ 0  1  4 ] [ 0  1  0 ] [ 0  1  0 ]
        [ 0  0  1 ] [ 0  0  1 ] [ 0  0  1 ]

(c) (1) First row reduce C to echelon form. We have

    [  1  1  2 ]   [ 1  1  2 ]   [ 1  1  2 ]
C = [  2  3  8 ] ~ [ 0  1  4 ] ~ [ 0  1  4 ]
    [ -3 -1  2 ]   [ 0  2  8 ]   [ 0  0  0 ]


In echelon form, C has a zero row. “STOP.” The matrix C cannot be row reduced to the identity
matrix I, and C cannot be written as a product of elementary matrices. (We note, in particular, that
C has no inverse.)


3.32. Find the inverse of

        [  1  2 -4 ]              [ 1  3 -4 ]
(a) A = [ -1 -1  5 ],     (b) B = [ 1  5 -1 ]
        [  2  7 -3 ]              [ 3 13 -6 ]

(a) Form the matrix M = [A, I] and row reduce M to echelon form:

    [  1  2 -4 | 1  0  0 ]   [ 1  2 -4 |  1  0  0 ]   [ 1  2 -4 |  1  0  0 ]
M = [ -1 -1  5 | 0  1  0 ] ~ [ 0  1  1 |  1  1  0 ] ~ [ 0  1  1 |  1  1  0 ]
    [  2  7 -3 | 0  0  1 ]   [ 0  3  5 | -2  0  1 ]   [ 0  0  2 | -5 -3  1 ]


In echelon form, the left half of M is in triangular form; hence, A has an inverse. Further reduce M to
row canonical form:


    [ 1  2  0 |  -9    -6     2  ]   [ 1  0  0 |  -16   -11     3  ]
M ~ [ 0  1  0 |  7/2   5/2  -1/2 ] ~ [ 0  1  0 |  7/2   5/2  -1/2 ]
    [ 0  0  1 | -5/2  -3/2   1/2 ]   [ 0  0  1 | -5/2  -3/2   1/2 ]

The final matrix has the form [I, A⁻¹]; that is, A⁻¹ is the right half of the last matrix. Thus,

      [  -16   -11     3  ]
A⁻¹ = [  7/2   5/2  -1/2 ]
      [ -5/2  -3/2   1/2 ]

(b) Form the matrix M = [B, I] and row reduce M to echelon form:

    [ 1  3 -4 | 1  0  0 ]   [ 1  3 -4 |  1  0  0 ]   [ 1  3 -4 |  1  0  0 ]
M = [ 1  5 -1 | 0  1  0 ] ~ [ 0  2  3 | -1  1  0 ] ~ [ 0  2  3 | -1  1  0 ]
    [ 3 13 -6 | 0  0  1 ]   [ 0  4  6 | -3  0  1 ]   [ 0  0  0 | -1 -2  1 ]


In echelon form, M has a zero row in its left half; that is, B is not row reducible to triangular form.
Accordingly, B has no inverse.
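The method used above (row reduce M = [A, I] until the left half becomes I, or stop when the left half cannot be reduced) can be sketched as follows. The function name inverse is a hypothetical choice, and returning None for a singular matrix is a convention of this sketch.

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by row reducing [A, I]; return None if A is singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next((i for i in range(c, n) if M[i][c] != 0), None)
        if p is None:
            return None                     # left half cannot be reduced to I
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for i in range(n):
            if i != c:
                k = M[i][c]
                M[i] = [a - k * b for a, b in zip(M[i], M[c])]
    return [row[n:] for row in M]           # right half of [I, A^-1]

A = [[1, 2, -4], [-1, -1, 5], [2, 7, -3]]
B = [[1, 3, -4], [1, 5, -1], [3, 13, -6]]
A_inv = inverse(A)
print([[str(x) for x in row] for row in A_inv])
print(inverse(B))
```

The row operations that carry A to I are applied simultaneously to I, which is why the right half ends up holding the inverse.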
3.33. Show that every elementary matrix E is invertible, and its inverse is an elementary matrix.


Let E be the elementary matrix corresponding to the elementary operation e; that is, e(I) = E. Let e′ be
the inverse operation of e and let E′ be the corresponding elementary matrix; that is, e′(I) = E′. Then


I = e′(e(I)) = e′(E) = E′E and I = e(e′(I)) = e(E′) = EE′
Therefore, E′ is the inverse of E.


3.34. Prove Theorem 3.16: Let e be an elementary row operation and let E be the corresponding
m-square elementary matrix; that is, E = e(I). Then e(A) = EA, where A is any m × n matrix.
Let R_i be the row i of A; we denote this by writing A = [R_1, ..., R_m]. If B is a matrix for which AB is
defined, then AB = [R_1B, ..., R_mB]. We also let

e_i = (0, ..., 0, 1, 0, ..., 0), where 1 is the ith entry


One can show (Problem 2.45) that e_iA = R_i. We also note that
I = [e_1, e_2, ..., e_m] is the identity matrix.


(i) Let e be the elementary row operation “Interchange rows R_i and R_j.” Then, with e_j in position i and
e_i in position j,
E = e(I) = [e_1, ..., e_j, ..., e_i, ..., e_m]
and
e(A) = [R_1, ..., R_j, ..., R_i, ..., R_m]
Thus,

EA = [e_1A, ..., e_jA, ..., e_iA, ..., e_mA] = [R_1, ..., R_j, ..., R_i, ..., R_m] = e(A)

(ii) Let e be the elementary row operation “Replace R_i by kR_i (k ≠ 0).” Then, with ke_i in position i,
E = e(I) = [e_1, ..., ke_i, ..., e_m]

and
e(A) = [R_1, ..., kR_i, ..., R_m]

Thus,

EA = [e_1A, ..., ke_iA, ..., e_mA] = [R_1, ..., kR_i, ..., R_m] = e(A)

(iii) Let e be the elementary row operation “Replace R_i by kR_j + R_i.” Then, with ke_j + e_i in position i,

E = e(I) = [e_1, ..., ke_j + e_i, ..., e_m]
and
e(A) = [R_1, ..., kR_j + R_i, ..., R_m]
Using (ke_j + e_i)A = k(e_jA) + e_iA = kR_j + R_i, we have
EA = [e_1A, ..., (ke_j + e_i)A, ..., e_mA]
= [R_1, ..., kR_j + R_i, ..., R_m] = e(A)


3.35. Prove Theorem 3.17: Let A be a square matrix. Then the following are equivalent:
(a) A is invertible (nonsingular).
(b) A is row equivalent to the identity matrix I.
(c) A is a product of elementary matrices.
Suppose A is invertible and suppose A is row equivalent to matrix B in row canonical form. Then there
exist elementary matrices E_1, E_2, ..., E_s such that E_s...E_2E_1A = B. Because A is invertible and each
elementary matrix is invertible, B is also invertible. But if B ≠ I, then B has a zero row; whence B is not
invertible. Thus, B = I, and (a) implies (b).

If (b) holds, then there exist elementary matrices E_1, E_2, ..., E_s such that E_s...E_2E_1A = I. Hence,
A = (E_s...E_2E_1)⁻¹ = E_1⁻¹E_2⁻¹...E_s⁻¹. But the E_i⁻¹ are also elementary matrices. Thus (b) implies (c).

If (c) holds, then A = E_1E_2...E_s. The E_i are invertible matrices; hence, their product A is also
invertible. Thus, (c) implies (a). Accordingly, the theorem is proved.


3.36. Prove Theorem 3.18: If AB = I, then BA = I, and hence B = A⁻¹.


Suppose A is not invertible. Then A is not row equivalent to the identity matrix I, and so A is row
equivalent to a matrix with a zero row. In other words, there exist elementary matrices E_1, ..., E_s such
that E_s...E_2E_1A has a zero row. Hence, E_s...E_2E_1AB = E_s...E_2E_1, an invertible matrix, also has a
zero row. But invertible matrices cannot have zero rows; hence A is invertible, with inverse A⁻¹. Then
also,


B = IB = (A⁻¹A)B = A⁻¹(AB) = A⁻¹I = A⁻¹
3.37. Prove Theorem 3.19: B is row equivalent to A (written B ~ A) if and only if there exists a
nonsingular matrix P such that B = PA.


If B ~ A, then B = e_s(...(e_2(e_1(A)))...) = E_s...E_2E_1A = PA where P = E_s...E_2E_1 is nonsingular.
Conversely, suppose B = PA, where P is nonsingular. By Theorem 3.17, P is a product of elementary
matrices, and so B can be obtained from A by a sequence of elementary row operations; that is, B ~ A. Thus,
the theorem is proved.


3.38. Prove Theorem 3.21: Every m × n matrix A is equivalent to a unique block matrix of the form

[ I_r  0 ]
[  0   0 ], where I_r is the r × r identity matrix.


The proof is constructive, in the form of an algorithm.


Step 1. Row reduce A to row canonical form, with leading nonzero entries a_{1j_1}, a_{2j_2}, ..., a_{rj_r}.
Step 2. Interchange C_1 and C_{j_1}, interchange C_2 and C_{j_2}, ..., and interchange C_r and C_{j_r}. This gives a
matrix in the form

[ I_r  B ]
[  0   0 ], with leading nonzero entries a_11, a_22, ..., a_rr.


Step 3. Use column operations, with the a_ii as pivots, to replace each entry in B with a zero; that is, for
i = 1, 2, ..., r and j = r + 1, r + 2, ..., n, apply the operation -b_ij C_i + C_j → C_j.


The final matrix has the desired form

[ I_r  0 ]
[  0   0 ].


LU Factorization


3.39. Find the LU factorization of

        [  1 -3  5 ]              [  1  4 -3 ]
(a) A = [  2 -4  7 ],     (b) B = [  2  8  1 ]
        [ -1 -2  1 ]              [ -5 -9  7 ]


(a) Reduce A to triangular form by the following operations:


“Replace R2 by -2R1 + R2,” “Replace R3 by R1 + R3,” and then
“Replace R3 by (5/2)R2 + R3”


These operations yield the following, where the triangular form is U:


    [ 1 -3  5 ]   [ 1 -3    5  ]                   [  1    0   0 ]
A ~ [ 0  2 -3 ] ~ [ 0  2   -3  ] = U    and    L = [  2    1   0 ]
    [ 0 -5  6 ]   [ 0  0  -3/2 ]                   [ -1  -5/2  1 ]


The entries 2, -1, -5/2 in L are the negatives of the multipliers -2, 1, 5/2 in the above row operations. (As
a check, multiply L and U to verify A = LU.)
(b) Reduce B to triangular form by first applying the operations “Replace R2 by -2R1 + R2” and “Replace
R3 by 5R1 + R3.” These operations yield


    [ 1   4  -3 ]
B ~ [ 0   0   7 ]
    [ 0  11  -8 ]


Observe that the second diagonal entry is 0. Thus, B cannot be brought into triangular form without row
interchange operations. Accordingly, B is not LU-factorable. (There does exist a PLU factorization of
such a matrix B, where P is a permutation matrix, but such a factorization lies beyond the scope of this
text.)


3.40. Find the LDU factorization of the matrix A in Problem 3.39.


The A = LDU factorization refers to the situation where L is a lower triangular matrix with 1's on the
diagonal (as in the LU factorization of A), D is a diagonal matrix, and U is an upper triangular matrix with 1's
on the diagonal. Thus, simply factor out the diagonal entries in the matrix U in the above LU factorization of A
to obtain D and the new U. That is,


\[
L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -\frac{5}{2} & 1 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -\frac{3}{2} \end{bmatrix}, \quad U = \begin{bmatrix} 1 & -3 & 5 \\ 0 & 1 & -\frac{3}{2} \\ 0 & 0 & 1 \end{bmatrix}
\]
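
As a check on Problem 3.40, the step "factor out the diagonal entries of U" can be sketched in Python. The helper names `split_diagonal` and `matmul` are hypothetical, and the matrices L and U are those obtained by carrying out the row operations of Problem 3.39(a) exactly.

```python
from fractions import Fraction as F

def split_diagonal(u):
    """Write an upper triangular U with nonzero diagonal as D * U1, U1 having unit diagonal."""
    n = len(u)
    d = [[u[i][i] if i == j else F(0) for j in range(n)] for i in range(n)]
    u1 = [[u[i][j] / u[i][i] for j in range(n)] for i in range(n)]
    return d, u1

def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

L = [[F(1), F(0), F(0)], [F(2), F(1), F(0)], [F(-1), F(-5, 2), F(1)]]
U = [[F(1), F(-3), F(5)], [F(0), F(2), F(-3)], [F(0), F(0), F(-3, 2)]]

D, U1 = split_diagonal(U)
A = matmul(matmul(L, D), U1)   # should reproduce the matrix A of Problem 3.39
print([[str(x) for x in row] for row in A])
```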
3.41. Find the LU factorization of the matrix $A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 3 \\ -3 & -10 & 2 \end{bmatrix}$.

Reduce A to triangular form by the following operations:
(1) "Replace $R_2$ by $-2R_1 + R_2$," (2) "Replace $R_3$ by $3R_1 + R_3$," (3) "Replace $R_3$ by $-4R_2 + R_3$"
These operations yield the following, where the triangular form is U:

\[
A \sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 0 & -4 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 0 & 0 & 1 \end{bmatrix} = U \quad\text{and}\quad L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -3 & 4 & 1 \end{bmatrix}
\]


The entries 2, -3, 4 in L are the negatives of the multipliers -2, 3, -4 in the above row operations. (As a
check, multiply L and U to verify A = LU.)


3.42. Let A be the matrix in Problem 3.41. Find $X_1, X_2, X_3$, where $X_i$ is the solution of $AX = B_i$ for
(a) $B_1 = (1,1,1)$, (b) $B_2 = B_1 + X_1$, (c) $B_3 = B_2 + X_2$.


(a) Find $L^{-1}B_1$ by applying the row operations (1), (2), and then (3) in Problem 3.41 to $B_1$:


\[
B_1 = [1, 1, 1]^T \xrightarrow{(1)\text{ and }(2)} [1, -1, 4]^T \xrightarrow{(3)} [1, -1, 8]^T
\]


Solve UX = B for B = (1, -1, 8) by back-substitution to obtain $X_1 = (-25, 9, 8)$.
(b) First find $B_2 = B_1 + X_1 = (1,1,1) + (-25, 9, 8) = (-24, 10, 9)$. Then, as above,


\[
B_2 = [-24, 10, 9]^T \xrightarrow{(1)\text{ and }(2)} [-24, 58, -63]^T \xrightarrow{(3)} [-24, 58, -295]^T
\]


Solve UX = B for B = (-24, 58, -295) by back-substitution to obtain $X_2 = (977, -353, -295)$.
(c) First find $B_3 = B_2 + X_2 = (-24, 10, 9) + (977, -353, -295) = (953, -343, -286)$. Then, as above,


\[
B_3 = [953, -343, -286]^T \xrightarrow{(1)\text{ and }(2)} [953, -2249, 2573]^T \xrightarrow{(3)} [953, -2249, 11569]^T
\]


Solve UX = B for B = (953, -2249, 11569) by back-substitution to obtain
$X_3 = (-38252, 13818, 11569)$.
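
Because the same L and U serve every right-hand side, each system in Problem 3.42 costs one forward substitution and one back-substitution. The sketch below illustrates this in Python; the helper names `forward`, `backward`, and `matvec` are assumptions made for the illustration, and each solution is checked by substituting it back into $AX = B_k$ rather than taken on trust.

```python
from fractions import Fraction

A = [[1, 2, 1], [2, 3, 3], [-3, -10, 2]]
L = [[1, 0, 0], [2, 1, 0], [-3, 4, 1]]     # from Problem 3.41
U = [[1, 2, 1], [0, -1, 1], [0, 0, 1]]     # from Problem 3.41

def forward(l, b):
    """Solve L y = b for a unit lower triangular L (this computes L^{-1} b)."""
    y = []
    for i in range(len(b)):
        y.append(b[i] - sum(l[i][j] * y[j] for j in range(i)))
    return y

def backward(u, y):
    """Solve U x = y for an upper triangular U with nonzero diagonal."""
    n = len(y)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(u[i][j] * x[j] for j in range(i + 1, n))
        x[i] = Fraction(y[i] - s) / u[i][i]
    return x

def matvec(a, x):
    return [sum(p * q for p, q in zip(row, x)) for row in a]

b = [1, 1, 1]                               # B_1
solutions = []
for k in range(3):
    x = backward(U, forward(L, b))
    assert matvec(A, x) == b                # substitute back: A X_k = B_k
    solutions.append(x)
    b = [p + q for p, q in zip(b, x)]       # B_{k+1} = B_k + X_k

for k, x in enumerate(solutions, start=1):
    print(f"X_{k} =", [int(v) for v in x])
```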
Miscellaneous Problems


3.43. Let L be a linear combination of the m equations in n unknowns in the system (3.2). Say L is the
equation


\[
(c_1a_{11} + \cdots + c_ma_{m1})x_1 + \cdots + (c_1a_{1n} + \cdots + c_ma_{mn})x_n = c_1b_1 + \cdots + c_mb_m \tag{1}
\]
Show that any solution of the system (3.2) is also a solution of L.


Let $u = (k_1, \ldots, k_n)$ be a solution of (3.2). Then


\[
a_{i1}k_1 + a_{i2}k_2 + \cdots + a_{in}k_n = b_i \qquad (i = 1, 2, \ldots, m) \tag{2}
\]
Substituting u in the left-hand side of (1) and using (2), we get
\[
\begin{aligned}
(c_1a_{11} + \cdots + c_ma_{m1})k_1 + \cdots + (c_1a_{1n} + \cdots + c_ma_{mn})k_n
&= c_1(a_{11}k_1 + \cdots + a_{1n}k_n) + \cdots + c_m(a_{m1}k_1 + \cdots + a_{mn}k_n) \\
&= c_1b_1 + \cdots + c_mb_m
\end{aligned}
\]


This is the right-hand side of (1); hence, u is a solution of (1).


3.44. Suppose a system $\mathcal{M}$ of linear equations is obtained from a system $\mathcal{L}$ by applying an elementary
operation (page 64). Show that $\mathcal{M}$ and $\mathcal{L}$ have the same solutions.


Each equation L in $\mathcal{M}$ is a linear combination of equations in $\mathcal{L}$. Hence, by Problem 3.43, any solution
of $\mathcal{L}$ will also be a solution of $\mathcal{M}$. On the other hand, each elementary operation has an inverse elementary
operation, so $\mathcal{L}$ can be obtained from $\mathcal{M}$ by an elementary operation. This means that any solution of $\mathcal{M}$ is a
solution of $\mathcal{L}$. Thus, $\mathcal{L}$ and $\mathcal{M}$ have the same solutions.


3.45. Prove Theorem 3.4: Suppose a system $\mathcal{M}$ of linear equations is obtained from a system $\mathcal{L}$ by a
sequence of elementary operations. Then $\mathcal{M}$ and $\mathcal{L}$ have the same solutions.


Each step of the sequence does not change the solution set (Problem 3.44). Thus, the original system $\mathcal{L}$
and the final system $\mathcal{M}$ (and any system in between) have the same solutions.


3.46. A system $\mathcal{L}$ of linear equations is said to be consistent if no linear combination of its equations is
a degenerate equation L with a nonzero constant. Show that $\mathcal{L}$ is consistent if and only if $\mathcal{L}$ is
reducible to echelon form.


Suppose $\mathcal{L}$ is reducible to echelon form. Then $\mathcal{L}$ has a solution, which must also be a solution of every
linear combination of its equations. Thus, L, which has no solution, cannot be a linear combination of the
equations in $\mathcal{L}$. Thus, $\mathcal{L}$ is consistent.

On the other hand, suppose $\mathcal{L}$ is not reducible to echelon form. Then, in the reduction process, it must
yield a degenerate equation L with a nonzero constant, which is a linear combination of the equations in $\mathcal{L}$.
Therefore, $\mathcal{L}$ is not consistent; that is, $\mathcal{L}$ is inconsistent.


3.47. Suppose u and v are distinct vectors. Show that, for distinct scalars k, the vectors u + k(u — v) are 
distinct. 


Suppose $u + k_1(u - v) = u + k_2(u - v)$. We need only show that $k_1 = k_2$. We have
$k_1(u - v) = k_2(u - v)$, and so $(k_1 - k_2)(u - v) = 0$


Because u and v are distinct, $u - v \ne 0$. Hence, $k_1 - k_2 = 0$, and so $k_1 = k_2$.


3.48. Suppose AB is defined. Prove:


(a) Suppose A has a zero row. Then AB has a zero row.
(b) Suppose B has a zero column. Then AB has a zero column.

(a) Let $R_i$ be the zero row of A, and $C_1, \ldots, C_n$ the columns of B. Then the ith row of AB is


\[
(R_iC_1, R_iC_2, \ldots, R_iC_n) = (0, 0, 0, \ldots, 0)
\]


(b) $B^T$ has a zero row, and so $B^TA^T = (AB)^T$ has a zero row. Hence, AB has a zero column.


SUPPLEMENTARY PROBLEMS


Linear Equations, 2 x 2 Systems


3.49. Determine whether each of the following systems is linear:
(a) $3x - 4y + 2yz = 8$, (b) $ex + 3y = \pi$, (c) $2x - 3y + kz = 4$


3.50. Solve (a) $\pi x = 2$, (b) $3x + 2 = 5x + 7 - 2x$, (c) $6x + 2 - 4x = 5 + 2x - 3$


3.51. Solve each of the following systems:
(a) $2x + 3y = 1$, $5x + 7y = 3$
(b) $4x - 2y = 5$, $-6x + 3y = 1$
(c) $2x - 4 = 3y$, $5y - x = 5$
(d) $2x - 4y = 10$, $3x - 6y = 15$


3.52. Consider each of the following systems in unknowns x and y:
(a) $x - ay = 1$, $ax - 4y = b$
(b) $ax + 3y = 2$, $12x + ay = b$
(c) $x + ay = 3$, $2x + 5y = b$
For which values of a does each system have a unique solution, and for which pairs of values (a, b) does
each system have more than one solution?


General Systems of Linear Equations


3.53. Solve
(a) $x + y + 2z = 4$, $2x + 3y + 6z = 10$, $3x + 6y + 10z = 17$
(b) $x - 2y + 3z = 2$, $2x - 3y + 8z = 7$, $3x - 4y + 13z = 8$
(c) $x + 2y + 3z = 3$, $2x + 3y + 8z = 4$, $5x + 8y + 19z = 11$


3.54. Solve
(a) $x - 2y = 5$, $2x + 3y = 3$, $3x + 2y = 7$
(b) $x + 2y - 3z + 2t = 2$, $2x + 5y - 8z + 6t = 5$, $3x + 4y - 5z + 2t = 4$
(c) $x + 2y + 4z - 5t = 3$, $3x - y + 5z + 2t = 4$, $5x - 4y + 6z + 9t = 2$


3.55. Solve
(a) $2x - y - 4z = 2$, $4x - 2y - 6z = 5$, $6x - 3y - 8z = 8$
(b) $x + 2y - z + 3t = 3$, $2x + 4y + 4z + 3t = 9$, $3x + 6y - z + 8t = 10$


3.56. Consider each of the following systems in unknowns x, y, z:
(a) $x - 2y = 1$, $x - y + az = 2$, $ay + 9z = b$
(b) $x + 2y + 2z = 1$, $x + ay + 3z = 3$, $x + 11y + az = b$
(c) $x + y + az = 1$, $x + ay + z = 4$, $ax + y + z = b$
For which values of a does the system have a unique solution, and for which pairs of values (a, b) does the
system have more than one solution? The value of b does not have any effect on whether the system has a
unique solution. Why?


Linear Combinations, Homogeneous Systems
3.57. Write v as a linear combination of $u_1, u_2, u_3$, where
(a) $v = (4, -9, 2)$, $u_1 = (1, 2, -1)$, $u_2 = (1, 4, 2)$, $u_3 = (1, -3, 2)$;
(b) $v = (1, 3, 2)$, $u_1 = (1, 2, 1)$, $u_2 = (2, 6, 5)$, $u_3 = (1, 7, 8)$;
(c) $v = (1, 4, 6)$, $u_1 = (1, 1, 2)$, $u_2 = (2, 3, 5)$, $u_3 = (3, 5, 8)$.


3.58. Let $u_1 = (1, 1, 2)$, $u_2 = (1, 3, -2)$, $u_3 = (4, -2, -1)$ in $\mathbf{R}^3$. Show that $u_1, u_2, u_3$ are orthogonal, and write v
as a linear combination of $u_1, u_2, u_3$, where (a) $v = (5, -5, 9)$, (b) $v = (1, -3, 3)$, (c) $v = (1, 1, 1)$.
(Hint: Use Fourier coefficients.)


3.59. Find the dimension and a basis of the general solution W of each of the following homogeneous systems:
(a) $x - y + 2z = 0$, $2x + y + z = 0$, $5x + y + 4z = 0$
(b) $x + 2y - 3z = 0$, $2x + 5y + 2z = 0$, $3x - y - 4z = 0$
(c) $x + 2y + 3z + t = 0$, $2x + 4y + 7z + 4t = 0$, $3x + 6y + 10z + 5t = 0$


3.60. Find the dimension and a basis of the general solution W of each of the following systems:
(a) $x_1 + 3x_2 + 2x_3 - x_4 - x_5 = 0$, $2x_1 + 6x_2 + 5x_3 + x_4 - x_5 = 0$, $5x_1 + 15x_2 + 12x_3 + x_4 - 3x_5 = 0$
(b) $2x_1 - 4x_2 + 3x_3 - x_4 + 2x_5 = 0$, $3x_1 - 6x_2 + 5x_3 - 2x_4 + 4x_5 = 0$, $5x_1 - 10x_2 + 7x_3 - 3x_4 + 18x_5 = 0$


Echelon Matrices, Row Canonical Form


3.61. Reduce each of the following matrices to echelon form and then to row canonical form:

(a) $\begin{bmatrix} 1 & 1 & 2 \\ 2 & 4 & 9 \\ 1 & 5 & 12 \end{bmatrix}$, (b) $\begin{bmatrix} 1 & 2 & -1 & 2 & 1 \\ 2 & 4 & 1 & -2 & 5 \\ 3 & 6 & 3 & -7 & 7 \end{bmatrix}$, (c) $\begin{bmatrix} 2 & 4 & 2 & -2 & 5 & 1 \\ 3 & 6 & 2 & 2 & 0 & 4 \\ 4 & 8 & 2 & 6 & -5 & 7 \end{bmatrix}$


3.62. Reduce each of the following matrices to echelon form and then to row canonical form:

(a) $\begin{bmatrix} 1 & 2 & 1 & 2 & 1 & 2 \\ 2 & 4 & 3 & 5 & 5 & 7 \\ 3 & 6 & 4 & 9 & 10 & 11 \\ 1 & 2 & 4 & 3 & 6 & 9 \end{bmatrix}$, (b) $\begin{bmatrix} 0 & 1 & 2 & 3 \\ 0 & 3 & 8 & 12 \\ 0 & 0 & 4 & 6 \\ 0 & 2 & 7 & 10 \end{bmatrix}$, (c) $\begin{bmatrix} 1 & 3 & 1 & 3 \\ 2 & 8 & 5 & 10 \\ 1 & 7 & 7 & 11 \\ 3 & 11 & 7 & 15 \end{bmatrix}$


3.63. Using only 0's and 1's, list all possible 2 x 2 matrices in row canonical form.


3.64. Using only 0's and 1's, find the number n of possible 3 x 3 matrices in row canonical form.


Elementary Matrices, Applications 
3.65. Let $e_1, e_2, e_3$ denote, respectively, the following elementary row operations:


"Interchange $R_2$ and $R_3$," "Replace $R_2$ by $3R_2$," "Replace $R_1$ by $2R_3 + R_1$"


(a) Find the corresponding elementary matrices $E_1, E_2, E_3$.

(b) Find the inverse operations $e_1^{-1}, e_2^{-1}, e_3^{-1}$; their corresponding elementary matrices $E_1', E_2', E_3'$; and the
relationship between them and $E_1, E_2, E_3$.

(c) Describe the corresponding elementary column operations $f_1, f_2, f_3$.

(d) Find elementary matrices $F_1, F_2, F_3$ corresponding to $f_1, f_2, f_3$, and the relationship between them and
$E_1, E_2, E_3$.

3.66. Express each of the following matrices as a product of elementary matrices:

$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $B = \begin{bmatrix} 3 & -6 \\ -2 & 4 \end{bmatrix}$, $C = \begin{bmatrix} 2 & 6 \\ -3 & -7 \end{bmatrix}$, $D = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 3 \\ 3 & 8 & 7 \end{bmatrix}$

3.67. Find the inverse of each of the following matrices (if it exists):

$A = \begin{bmatrix} 1 & -2 & -1 \\ 2 & -3 & 1 \\ 3 & -4 & 4 \end{bmatrix}$, $B = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 6 & 1 \\ 3 & 10 & -1 \end{bmatrix}$, $C = \begin{bmatrix} 1 & 3 & -2 \\ 2 & 8 & -3 \\ 1 & 7 & 1 \end{bmatrix}$, $D = \begin{bmatrix} 2 & 1 & -1 \\ 5 & 2 & -3 \\ 0 & 2 & 1 \end{bmatrix}$

3.68. Find the inverse of each of the following n x n matrices: 


(a) A has 1’s on the diagonal and superdiagonal (entries directly above the diagonal) and 0’s elsewhere. 
(b) B has 1’s on and above the diagonal, and 0’s below the diagonal. 


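LU Factorization


3.69. Find the LU factorization of each of the following matrices:

(a) $\begin{bmatrix} 1 & -1 & -1 \\ 3 & -4 & -2 \\ 2 & -3 & -2 \end{bmatrix}$, (b) $\begin{bmatrix} 1 & 3 & -1 \\ 2 & 5 & 1 \\ 3 & 4 & 2 \end{bmatrix}$, (c) $\begin{bmatrix} 2 & 3 & 6 \\ 4 & 7 & 9 \\ 3 & 5 & 4 \end{bmatrix}$, (d) $\begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 7 \\ 3 & 7 & 10 \end{bmatrix}$


3.70. Let A be the matrix in Problem 3.69(a). Find $X_1, X_2, X_3, X_4$, where

(a) $X_1$ is the solution of $AX = B_1$, where $B_1 = (1, 1, 1)$.
(b) For $k > 1$, $X_k$ is the solution of $AX = B_k$, where $B_k = B_{k-1} + X_{k-1}$.


3.71. Let B be the matrix in Problem 3.69(b). Find the LDU factorization of B.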


Miscellaneous Problems 


3.72. Consider the following systems in unknowns x and y: 


(a) ax+by=1 (b) ax+ by=0 
cx+dy=0 cx+dy=1 


Suppose $D = ad - bc \ne 0$. Show that each system has the unique solution:
(a) $x = d/D$, $y = -c/D$, (b) $x = -b/D$, $y = a/D$.


3.73. Find the inverse of the row operation "Replace $R_i$ by $kR_j + k'R_i$ ($k' \ne 0$)."


3.74. Prove that deleting the last column of an echelon form (respectively, the row canonical form) of an 
augmented matrix M = [A, B] yields an echelon form (respectively, the row canonical form) of A. 


3.75. Let e be an elementary row operation and E its elementary matrix, and let f be the corresponding elementary
column operation and F its elementary matrix. Prove


(a) $f(A) = (e(A^T))^T$, (b) $F = E^T$, (c) $f(A) = AF$.


3.76. Matrix A is equivalent to matrix B, written $A \approx B$, if there exist nonsingular matrices P and Q such that
$B = PAQ$. Prove that $\approx$ is an equivalence relation; that is,


(a) $A \approx A$, (b) If $A \approx B$, then $B \approx A$, (c) If $A \approx B$ and $B \approx C$, then $A \approx C$.

Notation: $A = [R_1; R_2; \ldots]$ denotes the matrix A with rows $R_1, R_2, \ldots$. The elements in each row are separated
by commas (which may be omitted with single digits), the rows are separated by semicolons, and 0 denotes a zero
row. For example,

\[
A = [1,2,3,4;\ 5,-6,7,-8;\ 0] = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & -6 & 7 & -8 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]

3.49. (a) no, (b) yes, (c) linear in x, y, z, not linear in x, y, z, k

3.50. (a) x = 2/π, (b) no solution, (c) every scalar k is a solution

3.51. (a) (2,-1), (b) no solution, (c) (5,2), (d) (5 + 2a, a)

3.52. (a) a ≠ ±2, (2,2), (-2,-2), (b) a ≠ ±6, (6,4), (-6,-4), (c) a ≠ 5/2, (5/2, 6)

3.53. (a) (2, 1, 1/2), (b) no solution, (c) u = (-7a - 1, 2a + 2, a)

3.54. (a) (3,-1), (b) u = (-a + 2b, 1 + 2a - 2b, a, b), (c) no solution

3.55. (a) u = ((1/2)a + 2, a, 1/2), (b) u = ((1/2)(7 - 5b - 4a), a, (1/2)(1 + b), b)

3.56. (a) a ≠ ±3, (3,3), (-3,-3), (b) a ≠ 5 and a ≠ -1, (5,7), (-1,-5), (c) a ≠ 1 and a ≠ -2, (-2,-5)

3.57. (a) 2, -1, 3, (b) 6, -3, 1, (c) not possible

3.58. (a) 3, -2, 1, (b) 2/3, -1, 1/3, (c) 2/3, 1/7, 1/21

3.59. (a) dim W = 1, $u_1$ = (-1, 1, 1); (b) dim W = 0, no basis; (c) dim W = 2, $u_1$ = (-2,1,0,0), $u_2$ = (5,0,-2,1)

3.60. (a) dim W = 3, $u_1$ = (-3,1,0,0,0), $u_2$ = (7,0,-3,1,0), $u_3$ = (3,0,-1,0,1),
(b) dim W = 2, $u_1$ = (2,1,0,0,0), $u_2$ = (5,0,-5,-3,1)

3.61. (a) [1,0,-1/2; 0,1,5/2; 0], (b) [1,2,0,0,2; 0,0,1,0,5; 0,0,0,1,2],
(c) [1,2,0,4,-5,3; 0,0,1,-5,15/2,-5/2; 0]

3.62. (a) [1,2,0,0,-4,-2; 0,0,1,0,1,2; 0,0,0,1,2,1; 0],
(b) [0,1,0,0; 0,0,1,0; 0,0,0,1; 0], (c) [1,0,0,4; 0,1,0,-1; 0,0,1,2; 0]

3.63. 5: [1,0; 0,1], [1,1; 0,0], [1,0; 0,0], [0,1; 0,0], 0

3.64. 16

3.65. (a) [1,0,0; 0,0,1; 0,1,0], [1,0,0; 0,3,0; 0,0,1], [1,0,2; 0,1,0; 0,0,1],
(b) $R_2 \leftrightarrow R_3$; $\frac{1}{3}R_2 \to R_2$; $-2R_3 + R_1 \to R_1$; each $E_i' = E_i^{-1}$,
(c) $C_2 \leftrightarrow C_3$, $3C_2 \to C_2$, $2C_3 + C_1 \to C_1$, (d) each $F_i = E_i^T$.

3.66. A = [1,0; 3,1][1,0; 0,-2][1,2; 0,1], B is not invertible,
C = [1,0; -3/2,1][1,0; 0,2][1,6; 0,1][2,0; 0,1],
D = [100; 010; 301][100; 010; 021][100; 013; 001][120; 010; 001]

3.67. $A^{-1}$ = [-8,12,-5; -5,7,-3; 1,-2,1], B has no inverse,
$C^{-1}$ = [29/2,-17/2,7/2; -5/2,3/2,-1/2; 3,-2,1], $D^{-1}$ = [8,-3,-1; -5,2,1; 10,-4,-1]

3.68. $A^{-1}$ = [1,-1,1,-1,...; 0,1,-1,1,-1,...; 0,0,1,-1,1,-1,1,...; ...; 0,...,0,1];
$B^{-1}$ has 1's on the diagonal, -1's on the superdiagonal, and 0's elsewhere.

3.69. (a) [100; 310; 211][1,-1,-1; 0,-1,1; 0,0,-1],
(b) [100; 210; 351][1,3,-1; 0,-1,3; 0,0,-10],
(c) [1,0,0; 2,1,0; 3/2,1/2,1][2,3,6; 0,1,-3; 0,0,-7/2],
(d) There is no LU decomposition.

3.70. $X_1 = [1,1,-1]^T$, $B_2 = [2,2,0]^T$, $X_2 = [6,4,0]^T$, $B_3 = [8,6,0]^T$, $X_3 = [22,16,-2]^T$,
$B_4 = [30,22,-2]^T$, $X_4 = [86,62,-6]^T$

3.71. B = [100; 210; 351] diag(1,-1,-10) [1,3,-1; 0,1,-3; 0,0,1]

3.73. Replace $R_i$ by $-(k/k')R_j + (1/k')R_i$.

3.75. (c) $f(A) = (e(A^T))^T = (EA^T)^T = (A^T)^TE^T = AF$

3.76. (a) $A = IAI$. (b) If $A = PBQ$, then $B = P^{-1}AQ^{-1}$.
(c) If $A = PBQ$ and $B = P'CQ'$, then $A = (PP')C(Q'Q)$.

Vector Spaces 


4.1 Introduction 


This chapter introduces the underlying structure of linear algebra, that of a finite-dimensional vector 
space. The definition of a vector space V, whose elements are called vectors, involves an arbitrary field K, 
whose elements are called scalars. The following notation will be used (unless otherwise stated or 
implied): 
    V                   the given vector space
    u, v, w             vectors in V
    K                   the given number field
    a, b, c, or k       scalars in K

Almost nothing essential is lost if the reader assumes that K is the real field R or the complex field C.
The reader might suspect that the real line R has "dimension" one, the cartesian plane $\mathbf{R}^2$ has
"dimension" two, and the space $\mathbf{R}^3$ has "dimension" three. This chapter formalizes the notion of
"dimension," and this definition will agree with the reader's intuition.
Throughout this text, we will use the following set notation:

    $a \in A$             Element a belongs to set A
    $a, b \in A$          Elements a and b belong to A
    $\forall x \in A$     For every x in A
    $\exists x \in A$     There exists an x in A
    $A \subseteq B$       A is a subset of B
    $A \cap B$            Intersection of A and B
    $A \cup B$            Union of A and B
    $\emptyset$           Empty set


4.2 Vector Spaces 


The following defines the notion of a vector space V where K is the field of scalars. 
DEFINITION: Let V be a nonempty set with two operations: 


(i) Vector Addition: This assigns to any $u, v \in V$ a sum $u + v$ in V.
(ii) Scalar Multiplication: This assigns to any $u \in V$, $k \in K$ a product $ku \in V$.


Then V is called a vector space (over the field K) if the following axioms hold for any
vectors $u, v, w \in V$:


[A1] $(u + v) + w = u + (v + w)$
[A2] There is a vector in V, denoted by 0 and called the zero vector, such that, for any
$u \in V$,
$u + 0 = 0 + u = u$
[A3] For each $u \in V$, there is a vector in V, denoted by $-u$, and called the negative of u,
such that
$u + (-u) = (-u) + u = 0$.
[A4] $u + v = v + u$.

[M1] $k(u + v) = ku + kv$, for any scalar $k \in K$.
[M2] $(a + b)u = au + bu$, for any scalars $a, b \in K$.
[M3] $(ab)u = a(bu)$, for any scalars $a, b \in K$.
[M4] $1u = u$, for the unit scalar $1 \in K$.


The above axioms naturally split into two sets (as indicated by the labeling of the axioms). The first 
four are concerned only with the additive structure of V and can be summarized by saying V is a 
commutative group under addition. This means 


(a) Any sum $v_1 + v_2 + \cdots + v_m$ of vectors requires no parentheses and does not depend on the order of
the summands.
(b) The zero vector 0 is unique, and the negative $-u$ of a vector u is unique.
(c) (Cancellation Law) If $u + w = v + w$, then $u = v$.

Also, subtraction in V is defined by $u - v = u + (-v)$, where $-v$ is the unique negative of v.
On the other hand, the remaining four axioms are concerned with the "action" of the field K of scalars
on the vector space V. Using these additional axioms, we prove (Problem 4.2) the following simple
properties of a vector space.


THEOREM 4.1: Let V be a vector space over a field K.
(i) For any scalar $k \in K$ and $0 \in V$, $k0 = 0$.
(ii) For $0 \in K$ and any vector $u \in V$, $0u = 0$.
(iii) If $ku = 0$, where $k \in K$ and $u \in V$, then $k = 0$ or $u = 0$.
(iv) For any $k \in K$ and any $u \in V$, $(-k)u = k(-u) = -ku$.


4.3 Examples of Vector Spaces 


This section lists important examples of vector spaces that will be used throughout the text. 


Space $K^n$


Let K be an arbitrary field. The notation $K^n$ is frequently used to denote the set of all n-tuples of elements
in K. Here $K^n$ is a vector space over K using the following operations:


(i) Vector Addition: $(a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$
(ii) Scalar Multiplication: $k(a_1, a_2, \ldots, a_n) = (ka_1, ka_2, \ldots, ka_n)$


The zero vector in $K^n$ is the n-tuple of zeros,
$0 = (0, 0, \ldots, 0)$

and the negative of a vector is defined by
$-(a_1, a_2, \ldots, a_n) = (-a_1, -a_2, \ldots, -a_n)$


Observe that these are the same as the operations defined for $\mathbf{R}^n$ in Chapter 1. The proof that $K^n$ is a
vector space is identical to the proof of Theorem 1.1, which we now regard as stating that $\mathbf{R}^n$ with the
operations defined there is a vector space over R.


Polynomial Space P(t)


Let P(t) denote the set of all polynomials of the form

$p(t) = a_0 + a_1t + a_2t^2 + \cdots + a_st^s \qquad (s = 1, 2, \ldots)$

where the coefficients $a_i$ belong to a field K. Then P(t) is a vector space over K using the following operations:


(i) Vector Addition: Here $p(t) + q(t)$ in P(t) is the usual operation of addition of polynomials.

(ii) Scalar Multiplication: Here $kp(t)$ in P(t) is the usual operation of the product of a scalar k and a
polynomial p(t).

The zero polynomial 0 is the zero vector in P(t).


Polynomial Space $P_n(t)$


Let $P_n(t)$ denote the set of all polynomials p(t) over a field K, where the degree of p(t) is less than or
equal to n; that is,

$p(t) = a_0 + a_1t + a_2t^2 + \cdots + a_st^s$

where $s \le n$. Then $P_n(t)$ is a vector space over K with respect to the usual operations of addition of
polynomials and of multiplication of a polynomial by a constant (just like the vector space P(t) above).
We include the zero polynomial 0 as an element of $P_n(t)$, even though its degree is undefined.


Matrix Space $M_{m,n}$


The notation $M_{m,n}$, or simply M, will be used to denote the set of all m x n matrices with entries in a field
K. Then $M_{m,n}$ is a vector space over K with respect to the usual operations of matrix addition and scalar
multiplication of matrices, as indicated by Theorem 2.1.


Function Space F(X)


Let X be a nonempty set and let K be an arbitrary field. Let F(X) denote the set of all functions of X into
K. [Note that F(X) is nonempty, because X is nonempty.] Then F(X) is a vector space over K with
respect to the following operations:


(i) Vector Addition: The sum of two functions f and g in F(X) is the function $f + g$ in F(X) defined by
$(f + g)(x) = f(x) + g(x) \qquad \forall x \in X$


(ii) Scalar Multiplication: The product of a scalar $k \in K$ and a function f in F(X) is the function kf in
F(X) defined by
$(kf)(x) = kf(x) \qquad \forall x \in X$

The zero vector in F(X) is the zero function 0, which maps every $x \in X$ into the zero element $0 \in K$;
$0(x) = 0 \qquad \forall x \in X$
Also, for any function f in F(X), the negative of f is the function $-f$ in F(X) defined by
$(-f)(x) = -f(x) \qquad \forall x \in X$
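
The pointwise operations on F(X) translate directly into functions that return functions. The short Python sketch below is only an illustration: the names `add`, `scale`, `zero`, and `neg` are hypothetical, X is taken to be a small finite set, and K is modeled by Python integers.

```python
def add(f, g):
    """Vector addition in F(X): (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(k, f):
    """Scalar multiplication in F(X): (kf)(x) = k f(x)."""
    return lambda x: k * f(x)

def zero(x):
    """The zero function, which sends every x to 0."""
    return 0

def neg(f):
    """The negative of f: (-f)(x) = -f(x)."""
    return lambda x: -f(x)

X = [0, 1, 2, 3]
f = lambda x: x * x
g = lambda x: 2 * x + 1

h = add(scale(3, f), neg(g))           # the function 3f - g
values = [h(x) for x in X]
cancels = all(add(f, neg(f))(x) == zero(x) for x in X)   # f + (-f) is the zero function on X
print(values, cancels)
```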


Fields and Subfields 


Suppose a field E is an extension of a field K; that is, suppose E is a field that contains K as a subfield.

Then E may be viewed as a vector space over K using the following operations:

(i) Vector Addition: Here $u + v$ in E is the usual addition in E.

(ii) Scalar Multiplication: Here $ku$ in E, where $k \in K$ and $u \in E$, is the usual product of k and u as
elements of E.


That is, the eight axioms of a vector space are satisfied by E and its subfield K with respect to the above
two operations.


4.4 Linear Combinations, Spanning Sets


Let V be a vector space over a field K. A vector v in V is a linear combination of vectors $u_1, u_2, \ldots, u_m$ in
V if there exist scalars $a_1, a_2, \ldots, a_m$ in K such that


$v = a_1u_1 + a_2u_2 + \cdots + a_mu_m$


Alternatively, v is a linear combination of $u_1, u_2, \ldots, u_m$ if there is a solution to the vector equation


$v = x_1u_1 + x_2u_2 + \cdots + x_mu_m$


where $x_1, x_2, \ldots, x_m$ are unknown scalars.


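EXAMPLE 4.1 (Linear Combinations in $\mathbf{R}^n$) Suppose we want to express $v = (3, 7, -4)$ in $\mathbf{R}^3$ as a linear
combination of the vectors


$u_1 = (1, 2, 3), \quad u_2 = (2, 3, 7), \quad u_3 = (3, 5, 6)$


We seek scalars x, y, z such that $v = xu_1 + yu_2 + zu_3$; that is,


\[
\begin{bmatrix} 3 \\ 7 \\ -4 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + y\begin{bmatrix} 2 \\ 3 \\ 7 \end{bmatrix} + z\begin{bmatrix} 3 \\ 5 \\ 6 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + 2y + 3z &= 3 \\ 2x + 3y + 5z &= 7 \\ 3x + 7y + 6z &= -4 \end{aligned}
\]


(For notational convenience, we have written the vectors in $\mathbf{R}^3$ as columns, because it is then easier to find the
equivalent system of linear equations.) Reducing the system to echelon form yields


\[
\begin{aligned} x + 2y + 3z &= 3 \\ -y - z &= 1 \\ y - 3z &= -13 \end{aligned}
\quad\text{and then}\quad
\begin{aligned} x + 2y + 3z &= 3 \\ -y - z &= 1 \\ -4z &= -12 \end{aligned}
\]


Back-substitution yields the solution $x = 2$, $y = -4$, $z = 3$. Thus, $v = 2u_1 - 4u_2 + 3u_3$.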
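
The computation in Example 4.1 can be reproduced mechanically. The Python sketch below is an illustration only: the function name `coefficients` is hypothetical, the system is assumed to be square with a unique solution, and exact arithmetic is done with the standard `fractions` module.

```python
from fractions import Fraction

def coefficients(us, v):
    """Solve x1*us[0] + ... + xn*us[n-1] = v by Gauss-Jordan elimination (unique solution assumed)."""
    n = len(v)
    # Augmented matrix: the given vectors are the columns, followed by v.
    m = [[Fraction(us[j][i]) for j in range(n)] + [Fraction(v[i])] for i in range(n)]
    for k in range(n):
        p = next(i for i in range(k, n) if m[i][k] != 0)   # pivot row
        m[k], m[p] = m[p], m[k]
        m[k] = [x / m[k][k] for x in m[k]]
        for i in range(n):
            if i != k:
                m[i] = [x - m[i][k] * y for x, y in zip(m[i], m[k])]
    return [row[n] for row in m]

u1, u2, u3 = (1, 2, 3), (2, 3, 7), (3, 5, 6)
v = (3, 7, -4)
x, y, z = coefficients([u1, u2, u3], v)

# Check the linear combination componentwise.
recombined = tuple(x * a + y * b + z * c for a, b, c in zip(u1, u2, u3))
print(x, y, z, recombined == v)
```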


Remark: Generally speaking, the question of expressing a given vector v in $K^n$ as a linear
combination of vectors $u_1, u_2, \ldots, u_m$ in $K^n$ is equivalent to solving a system AX = B of linear equations,
where v is the column B of constants, and the u's are the columns of the coefficient matrix A. Such a
system may have a unique solution (as above), many solutions, or no solution. The last case, no
solution, means that v cannot be written as a linear combination of the u's.


EXAMPLE 4.2 (Linear combinations in P(t)) Suppose we want to express the polynomial $v = 3t^2 + 5t - 5$ as a
linear combination of the polynomials


$p_1 = t^2 + 2t + 1, \quad p_2 = 2t^2 + 5t + 4, \quad p_3 = t^2 + 3t + 6$

We seek scalars x, y, z such that $v = xp_1 + yp_2 + zp_3$; that is,

$3t^2 + 5t - 5 = x(t^2 + 2t + 1) + y(2t^2 + 5t + 4) + z(t^2 + 3t + 6) \qquad (*)$


There are two ways to proceed from here.


(1) Expand the right-hand side of (*) obtaining:


\[
\begin{aligned}
3t^2 + 5t - 5 &= xt^2 + 2xt + x + 2yt^2 + 5yt + 4y + zt^2 + 3zt + 6z \\
&= (x + 2y + z)t^2 + (2x + 5y + 3z)t + (x + 4y + 6z)
\end{aligned}
\]


Set coefficients of the same powers of t equal to each other, and reduce the system to echelon form:


\[
\begin{aligned} x + 2y + z &= 3 \\ 2x + 5y + 3z &= 5 \\ x + 4y + 6z &= -5 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + 2y + z &= 3 \\ y + z &= -1 \\ 2y + 5z &= -8 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + 2y + z &= 3 \\ y + z &= -1 \\ 3z &= -6 \end{aligned}
\]

The system is in triangular form and has a solution. Back-substitution yields the solution $x = 3$, $y = 1$, $z = -2$.
Thus,


$v = 3p_1 + p_2 - 2p_3$


(2) The equation (*) is actually an identity in the variable t; that is, the equation holds for any value
of t. We can obtain three equations in the unknowns x, y, z by setting t equal to any three values.
For example,


Set $t = 0$ in (*) to obtain: $x + 4y + 6z = -5$
Set $t = 1$ in (*) to obtain: $4x + 11y + 10z = 3$
Set $t = -1$ in (*) to obtain: $y + 4z = -7$


Reducing this system to echelon form and solving by back-substitution again yields the solution $x = 3$, $y = 1$,
$z = -2$. Thus (again), $v = 3p_1 + p_2 - 2p_3$.
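
Method (2) of Example 4.2 lends itself to a short check. In the Python sketch below, which is only an illustration, the helper names `poly`, `det3`, and `cramer` are hypothetical; the 3 x 3 system obtained by evaluating at t = 0, 1, -1 is solved by Cramer's rule rather than by echelon reduction, purely to keep the code short.

```python
from fractions import Fraction

def poly(coeffs):
    """Return the function t -> c0 + c1*t + c2*t^2 + ... (coefficients listed from degree 0 up)."""
    return lambda t: sum(c * t**k for k, c in enumerate(coeffs))

v = poly([-5, 5, 3])     # 3t^2 + 5t - 5
p1 = poly([1, 2, 1])     # t^2 + 2t + 1
p2 = poly([4, 5, 2])     # 2t^2 + 5t + 4
p3 = poly([6, 3, 1])     # t^2 + 3t + 6

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer(m, rhs):
    """Solve a 3 x 3 system with nonzero determinant by Cramer's rule."""
    d = det3(m)
    sols = []
    for j in range(3):
        mj = [[rhs[i] if k == j else m[i][k] for k in range(3)] for i in range(3)]
        sols.append(Fraction(det3(mj), d))
    return sols

points = [0, 1, -1]
M = [[p(t) for p in (p1, p2, p3)] for t in points]   # one equation per value of t
rhs = [v(t) for t in points]
x, y, z = cramer(M, rhs)
print(M, rhs, (x, y, z))
```

Any three distinct values of t would serve; 0, 1, and -1 are the ones used in the text.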


Spanning Sets 


Let V be a vector space over K. Vectors $u_1, u_2, \ldots, u_m$ in V are said to span V or to form a spanning set of
V if every v in V is a linear combination of the vectors $u_1, u_2, \ldots, u_m$; that is, if there exist scalars
$a_1, a_2, \ldots, a_m$ in K such that


$v = a_1u_1 + a_2u_2 + \cdots + a_mu_m$


The following remarks follow directly from the definition.


Remark 1: Suppose $u_1, u_2, \ldots, u_m$ span V. Then, for any vector w, the set $w, u_1, u_2, \ldots, u_m$ also
spans V.

Remark 2: Suppose $u_1, u_2, \ldots, u_m$ span V and suppose $u_k$ is a linear combination of some of the
other u's. Then the u's without $u_k$ also span V.

Remark 3: Suppose $u_1, u_2, \ldots, u_m$ span V and suppose one of the u's is the zero vector. Then the
u's without the zero vector also span V.


EXAMPLE 4.3 Consider the vector space $V = \mathbf{R}^3$.
(a) We claim that the following vectors form a spanning set of $\mathbf{R}^3$:
$e_1 = (1, 0, 0), \quad e_2 = (0, 1, 0), \quad e_3 = (0, 0, 1)$
Specifically, if $v = (a, b, c)$ is any vector in $\mathbf{R}^3$, then


$v = ae_1 + be_2 + ce_3$


For example, $v = (5, -6, 2) = 5e_1 - 6e_2 + 2e_3$.
(b) We claim that the following vectors also form a spanning set of $\mathbf{R}^3$:
$w_1 = (1, 1, 1), \quad w_2 = (1, 1, 0), \quad w_3 = (1, 0, 0)$
Specifically, if $v = (a, b, c)$ is any vector in $\mathbf{R}^3$, then (Problem 4.62)


$v = (a, b, c) = cw_1 + (b - c)w_2 + (a - b)w_3$
For example, $v = (5, -6, 2) = 2w_1 - 8w_2 + 11w_3$.
(c) One can show (Problem 3.24) that $v = (2, 7, 8)$ cannot be written as a linear combination of the vectors
$u_1 = (1, 2, 3), \quad u_2 = (1, 3, 5), \quad u_3 = (1, 5, 9)$


Accordingly, $u_1, u_2, u_3$ do not span $\mathbf{R}^3$.

EXAMPLE 4.4 Consider the vector space $V = P_n(t)$ consisting of all polynomials of degree $\le n$.


(a) Clearly every polynomial in $P_n(t)$ can be expressed as a linear combination of the n + 1 polynomials

$1, \ t, \ t^2, \ t^3, \ \ldots, \ t^n$

Thus, these powers of t (where $1 = t^0$) form a spanning set for $P_n(t)$.
(b) One can also show that, for any scalar c, the following n + 1 powers of $t - c$,

$1, \ t - c, \ (t - c)^2, \ (t - c)^3, \ \ldots, \ (t - c)^n$

(where $(t - c)^0 = 1$), also form a spanning set for $P_n(t)$.


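EXAMPLE 4.5 Consider the vector space $M = M_{2,2}$ consisting of all 2 x 2 matrices, and consider the following
four matrices in M:


\[
E_{11} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad E_{12} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad E_{21} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad E_{22} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\]


Then clearly any matrix A in M can be written as a linear combination of the four matrices. For example,


\[
A = \begin{bmatrix} 5 & -6 \\ 7 & 8 \end{bmatrix} = 5E_{11} - 6E_{12} + 7E_{21} + 8E_{22}
\]


Accordingly, the four matrices $E_{11}, E_{12}, E_{21}, E_{22}$ span M.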
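
Whether a finite list of vectors spans $\mathbf{R}^3$, as in Example 4.3, can be tested by row reduction. The Python sketch below is an illustration only: the function names are hypothetical, and it relies on the standard fact (developed later in this chapter, and assumed here) that vectors span $\mathbf{R}^n$ exactly when the matrix having them as rows has rank n.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of the matrix whose rows are the given vectors, found by Gaussian elimination."""
    rows = [[Fraction(x) for x in vec] for vec in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0                                   # number of pivots found so far
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue                        # no pivot in this column
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def spans_R3(vectors):
    return rank(vectors) == 3

e = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
w = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]
u = [(1, 2, 3), (1, 3, 5), (1, 5, 9)]
print(spans_R3(e), spans_R3(w), spans_R3(u))
```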


4.5 Subspaces 


This section introduces the important notion of a subspace. 


DEFINITION: Let V be a vector space over a field K and let W be a subset of V. Then W is a subspace
of V if W is itself a vector space over K with respect to the operations of vector
addition and scalar multiplication on V.

The way in which one shows that any set W is a vector space is to show that W satisfies the eight
axioms of a vector space. However, if W is a subset of a vector space V, then some of the axioms
automatically hold in W, because they already hold in V. Simple criteria for identifying subspaces follow.


THEOREM 4.2: Suppose W is a subset of a vector space V. Then W is a subspace of V if the following
two conditions hold:
(a) The zero vector 0 belongs to W.
(b) For every $u, v \in W$, $k \in K$: (i) The sum $u + v \in W$. (ii) The multiple $ku \in W$.


Property (i) in (b) states that W is closed under vector addition, and property (ii) in (b) states that W is 
closed under scalar multiplication. Both properties may be combined into the following equivalent single 
statement: 


(b') For every $u, v \in W$ and $a, b \in K$, the linear combination $au + bv \in W$.

Now let V be any vector space. Then V automatically contains two subspaces: the set {0} consisting of 
the zero vector alone and the whole space V itself. These are sometimes called the trivial subspaces of V. 
Examples of nontrivial subspaces follow. 

EXAMPLE 4.6 Consider the vector space $V = \mathbf{R}^3$.
(a) Let U consist of all vectors in $\mathbf{R}^3$ whose entries are equal; that is,
$U = \{(a, b, c) : a = b = c\}$


For example, (1,1,1), (-3,-3,-3), (7,7,7), (-2,-2,-2) are vectors in U. Geometrically, U is the line
through the origin O and the point (1, 1, 1) as shown in Fig. 4-1(a). Clearly 0 = (0,0,0) belongs to U, because
all entries in 0 are equal. Further, suppose u and v are arbitrary vectors in U, say, $u = (a, a, a)$ and $v = (b, b, b)$.
Then, for any scalar $k \in \mathbf{R}$, the following are also vectors in U:


$u + v = (a + b, a + b, a + b)$ and $ku = (ka, ka, ka)$


Thus, U is a subspace of $\mathbf{R}^3$.


(b) Let W be any plane in $\mathbf{R}^3$ passing through the origin, as pictured in Fig. 4-1(b). Then 0 = (0, 0, 0) belongs to W,
because we assumed W passes through the origin O. Further, suppose u and v are vectors in W. Then u and v
may be viewed as arrows in the plane W emanating from the origin O, as in Fig. 4-1(b). The sum $u + v$ and any
multiple $ku$ of u also lie in the plane W. Thus, W is a subspace of $\mathbf{R}^3$.


[Figure 4-1: (a) the line U through the origin; (b) a plane W through the origin]

EXAMPLE 4.7
(a) Let $V = M_{n,n}$, the vector space of n x n matrices. Let $W_1$ be the subset of all (upper) triangular matrices and let
$W_2$ be the subset of all symmetric matrices. Then $W_1$ is a subspace of V, because $W_1$ contains the zero matrix 0
and $W_1$ is closed under matrix addition and scalar multiplication; that is, the sum and scalar multiple of such
triangular matrices are also triangular. Similarly, $W_2$ is a subspace of V.


(b) Let V = P(t), the vector space of polynomials. Then the space $P_n(t)$ of polynomials of degree at most n
may be viewed as a subspace of P(t). Let Q(t) be the collection of polynomials with only even powers of t. For
example, the following are polynomials in Q(t):


$p_1 = 3 + 4t^2 - 5t^6$ and $p_2 = 6 - 7t^4 + 9t^6 + 3t^{12}$


(We assume that any constant $k = kt^0$ is an even power of t.) Then Q(t) is a subspace of P(t).


(c) Let V be the vector space of real-valued functions. Then the collection $W_1$ of continuous functions and the
collection $W_2$ of differentiable functions are subspaces of V.


Intersection of Subspaces 


Let U and W be subspaces of a vector space V. We show that the intersection $U \cap W$ is also a subspace of
V. Clearly, $0 \in U$ and $0 \in W$, because U and W are subspaces; whence $0 \in U \cap W$. Now suppose u and v
belong to the intersection $U \cap W$. Then $u, v \in U$ and $u, v \in W$. Further, because U and W are subspaces,
for any scalars $a, b \in K$,


$au + bv \in U$ and $au + bv \in W$


Thus, $au + bv \in U \cap W$. Therefore, $U \cap W$ is a subspace of V.
The above result generalizes as follows.


THEOREM 4.3: The intersection of any number of subspaces of a vector space V is a subspace of V.


Solution Space of a Homogeneous System


Consider a system AX = B of linear equations in n unknowns. Then every solution u may be viewed as a
vector in $K^n$. Thus, the solution set of such a system is a subset of $K^n$. Now suppose the system is
homogeneous; that is, suppose the system has the form AX = 0. Let W be its solution set. Because
A0 = 0, the zero vector $0 \in W$. Moreover, suppose u and v belong to W. Then u and v are solutions of
AX = 0, or, in other words, Au = 0 and Av = 0. Therefore, for any scalars a and b, we have


$A(au + bv) = aAu + bAv = a0 + b0 = 0 + 0 = 0$


Thus, $au + bv$ belongs to W, because it is a solution of AX = 0. Accordingly, W is a subspace of $K^n$.
We state the above result formally.


THEOREM 4.4: The solution set W of a homogeneous system AX = 0 in n unknowns is a subspace
of $K^n$.


We emphasize that the solution set of a nonhomogeneous system AX = B is not a subspace of $K^n$. In
fact, the zero vector 0 does not belong to its solution set.
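
The contrast between the homogeneous and nonhomogeneous cases can be seen numerically. The Python sketch below is an illustration under stated assumptions: the coefficient matrix is that of the homogeneous system in Problem 3.59(c), the vectors u and v are two of its solutions, the right-hand side B = (1, 2, 3) is chosen only for the example, and the helper names `matvec` and `comb` are hypothetical.

```python
def matvec(a, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(p * q for p, q in zip(row, x)) for row in a]

def comb(a, u, b, v):
    """The linear combination a*u + b*v."""
    return [a * p + b * q for p, q in zip(u, v)]

A = [[1, 2, 3, 1], [2, 4, 7, 4], [3, 6, 10, 5]]
u = [-2, 1, 0, 0]          # Au = 0
v = [5, 0, -2, 1]          # Av = 0

w = comb(3, u, -7, v)      # an arbitrary linear combination of solutions
homogeneous_ok = matvec(A, w) == [0, 0, 0]

B = [1, 2, 3]
x0 = [1, 0, 0, 0]          # a solution of AX = B
x1 = comb(1, x0, 1, u)     # another solution of AX = B
sum_is_solution = matvec(A, comb(1, x0, 1, x1)) == B

print(homogeneous_ok, sum_is_solution)
```

The sum of two solutions of AX = B satisfies AX = 2B instead, so the solution set of the nonhomogeneous system is not closed under addition.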


4.6 Linear Spans, Row Space of a Matrix 


Suppose $u_1, u_2, \ldots, u_m$ are any vectors in a vector space V. Recall (Section 4.4) that any vector of the
form $a_1u_1 + a_2u_2 + \cdots + a_mu_m$, where the $a_i$ are scalars, is called a linear combination of $u_1, u_2, \ldots, u_m$.
The collection of all such linear combinations, denoted by

$\operatorname{span}(u_1, u_2, \ldots, u_m)$ or $\operatorname{span}(u_i)$

is called the linear span of $u_1, u_2, \ldots, u_m$.


Clearly the zero vector 0 belongs to $\operatorname{span}(u_i)$, because
$0 = 0u_1 + 0u_2 + \cdots + 0u_m$
Furthermore, suppose v and v' belong to $\operatorname{span}(u_i)$, say,
$v = a_1u_1 + a_2u_2 + \cdots + a_mu_m$ and $v' = b_1u_1 + b_2u_2 + \cdots + b_mu_m$
Then,
$v + v' = (a_1 + b_1)u_1 + (a_2 + b_2)u_2 + \cdots + (a_m + b_m)u_m$
and, for any scalar $k \in K$,
$kv = ka_1u_1 + ka_2u_2 + \cdots + ka_mu_m$


Thus, $v + v'$ and $kv$ also belong to $\operatorname{span}(u_i)$. Accordingly, $\operatorname{span}(u_i)$ is a subspace of V.

More generally, for any subset S of V, span(S) consists of all linear combinations of vectors in S or,
when $S = \emptyset$, span(S) = {0}. Thus, in particular, S is a spanning set (Section 4.4) of span(S).

The following theorem, which was partially proved above, holds. 


THEOREM 4.5: Let S be a subset of a vector space V.


(i) Then span(S) is a subspace of V that contains S.
(ii) If W is a subspace of V containing S, then $\operatorname{span}(S) \subseteq W$.


Condition (ii) in Theorem 4.5 may be interpreted as saying that span(S) is the "smallest" subspace of
V containing S.


EXAMPLE 4.8 Consider the vector space $V = \mathbf{R}^3$. 

(a) Let u be any nonzero vector in $\mathbf{R}^3$. Then span(u) consists of all scalar multiples of u. Geometrically, span(u) is 
the line through the origin O and the endpoint of u, as shown in Fig. 4-2(a). 
[Figure 4-2, panels (a) and (b)] 


(b) Let u and v be vectors in $\mathbf{R}^3$ that are not multiples of each other. Then span(u, v) is the plane through the origin 
O and the endpoints of u and v as shown in Fig. 4-2(b). 

(c) Consider the vectors $e_1 = (1,0,0)$, $e_2 = (0,1,0)$, $e_3 = (0,0,1)$ in $\mathbf{R}^3$. Recall [Example 4.1(a)] that every vector 
in $\mathbf{R}^3$ is a linear combination of $e_1, e_2, e_3$. That is, $e_1, e_2, e_3$ form a spanning set of $\mathbf{R}^3$. Accordingly, 
$\operatorname{span}(e_1, e_2, e_3) = \mathbf{R}^3$. 


Row Space of a Matrix 


Let $A = [a_{ij}]$ be an arbitrary $m \times n$ matrix over a field K. The rows of A, 

$$R_1 = (a_{11}, a_{12}, \ldots, a_{1n}), \quad R_2 = (a_{21}, a_{22}, \ldots, a_{2n}), \quad \ldots, \quad R_m = (a_{m1}, a_{m2}, \ldots, a_{mn})$$

may be viewed as vectors in $K^n$; hence, they span a subspace of $K^n$ called the row space of A and denoted 
by rowsp(A). That is, 

$$\operatorname{rowsp}(A) = \operatorname{span}(R_1, R_2, \ldots, R_m)$$

Analogously, the columns of A may be viewed as vectors in $K^m$; hence, they span a subspace of $K^m$ called the 
column space of A and denoted by colsp(A). Observe that $\operatorname{colsp}(A) = \operatorname{rowsp}(A^T)$. 

Recall that matrices A and B are row equivalent, written A ~ B, if B can be obtained from A by a 
sequence of elementary row operations. Now suppose M is the matrix obtained by applying one of the 
following elementary row operations on a matrix A: 


(1) Interchange $R_i$ and $R_j$, (2) Replace $R_i$ by $kR_i$ with $k \neq 0$, (3) Replace $R_j$ by $kR_i + R_j$ 


Then each row of M is a row of A or a linear combination of rows of A. Hence, the row space of M is 
contained in the row space of A. On the other hand, we can apply the inverse elementary row operation on 
M to obtain A; hence, the row space of A is contained in the row space of M. Accordingly, A and M have 
the same row space. This will be true each time we apply an elementary row operation. Thus, we have 
proved the following theorem. 


THEOREM 4.6: Row equivalent matrices have the same row space. 


We are now able to prove (Problems 4.45—4.47) basic results on row equivalence (which first 
appeared as Theorems 3.7 and 3.8 in Chapter 3). 


THEOREM 4.7: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are row equivalent echelon matrices with respective 
pivot entries 

$$a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r} \quad\text{and}\quad b_{1k_1}, b_{2k_2}, \ldots, b_{sk_s}$$

Then A and B have the same number of nonzero rows (that is, r = s) and their 
pivot entries are in the same positions (that is, $j_1 = k_1, j_2 = k_2, \ldots, j_r = k_r$). 


THEOREM 4.8: Suppose A and B are row canonical matrices. Then A and B have the same row space 
if and only if they have the same nonzero rows. 

COROLLARY 4.9: Every matrix A is row equivalent to a unique matrix in row canonical form. 


We apply the above results in the next example. 


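EXAMPLE 4.9 Consider the following two sets of vectors in $\mathbf{R}^4$: 

$u_1 = (1, 2, -1, 3)$, $u_2 = (2, 4, 1, -2)$, $u_3 = (3, 6, 3, -7)$ 
$w_1 = (1, 2, -4, 11)$, $w_2 = (2, 4, -5, 14)$ 

Let $U = \operatorname{span}(u_i)$ and $W = \operatorname{span}(w_i)$. There are two ways to show that U = W. 

(a) Show that each $u_i$ is a linear combination of $w_1$ and $w_2$, and show that each $w_i$ is a linear combination of $u_1$, $u_2$, 
$u_3$. Observe that we have to show that five systems of linear equations are consistent (three for the $u_i$ and two for the $w_i$). 

(b) Form the matrix A whose rows are $u_1, u_2, u_3$ and row reduce A to row canonical form, and form the matrix B 
whose rows are $w_1$ and $w_2$ and row reduce B to row canonical form: 

$$A = \begin{bmatrix} 1 & 2 & -1 & 3 \\ 2 & 4 & 1 & -2 \\ 3 & 6 & 3 & -7 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 0 & 3 & -8 \\ 0 & 0 & 6 & -16 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & \tfrac{1}{3} \\ 0 & 0 & 1 & -\tfrac{8}{3} \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

$$B = \begin{bmatrix} 1 & 2 & -4 & 11 \\ 2 & 4 & -5 & 14 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -4 & 11 \\ 0 & 0 & 3 & -8 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & \tfrac{1}{3} \\ 0 & 0 & 1 & -\tfrac{8}{3} \end{bmatrix}$$

Because the nonzero rows of the matrices in row canonical form are identical, the row spaces of A and B are 
equal. Therefore, U = W. 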


Clearly, the method in (b) is more efficient than the method in (a). 
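
Method (b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `rref`, `nonzero_rows`, and `same_span` are hypothetical, and exact fractions are used so that the comparison of row canonical forms is not affected by rounding.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the row canonical form of `rows` and its pivot column indices."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(M)
    n_cols = len(M[0]) if M else 0
    pivots = []
    r = 0
    for c in range(n_cols):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]          # make the pivot equal to 1
        for i in range(n_rows):
            if i != r and M[i][c] != 0:         # clear the rest of the column
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return M, pivots

def nonzero_rows(M):
    """Drop the zero rows of a matrix."""
    return [row for row in M if any(x != 0 for x in row)]

def same_span(vectors1, vectors2):
    """True if both lists span the same subspace (Theorems 4.6 and 4.8)."""
    R1, _ = rref(vectors1)
    R2, _ = rref(vectors2)
    return nonzero_rows(R1) == nonzero_rows(R2)

U = [[1, 2, -1, 3], [2, 4, 1, -2], [3, 6, 3, -7]]
W = [[1, 2, -4, 11], [2, 4, -5, 14]]
print(same_span(U, W))
```

The test relies on Corollary 4.9: because the row canonical form is unique, two spanning lists describe the same subspace exactly when their canonical forms have the same nonzero rows.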


4.7 Linear Dependence and Independence 


Let V be a vector space over a field K. The following defines the notion of linear dependence and 
independence of vectors over K. (One usually suppresses mentioning K when the field is understood.) 
This concept plays an essential role in the theory of linear algebra and in mathematics in general. 


DEFINITION: We say that the vectors $v_1, v_2, \ldots, v_m$ in V are linearly dependent if there exist scalars 
$a_1, a_2, \ldots, a_m$ in K, not all of them 0, such that 

$$a_1v_1 + a_2v_2 + \cdots + a_mv_m = 0$$

Otherwise, we say that the vectors are linearly independent. 

The above definition may be restated as follows. Consider the vector equation 

$$x_1v_1 + x_2v_2 + \cdots + x_mv_m = 0 \qquad (*)$$

where the x's are unknown scalars. This equation always has the zero solution $x_1 = 0$, 
$x_2 = 0, \ldots, x_m = 0$. Suppose this is the only solution; that is, suppose we can show: 

$$x_1v_1 + x_2v_2 + \cdots + x_mv_m = 0 \quad\text{implies}\quad x_1 = 0,\ x_2 = 0,\ \ldots,\ x_m = 0$$

Then the vectors $v_1, v_2, \ldots, v_m$ are linearly independent. On the other hand, suppose the equation (*) has 
a nonzero solution; then the vectors are linearly dependent. 

A set $S = \{v_1, v_2, \ldots, v_m\}$ of vectors in V is linearly dependent or independent according to whether 
the vectors $v_1, v_2, \ldots, v_m$ are linearly dependent or independent. 
An infinite set S of vectors is linearly dependent or independent according to whether there do or do 
not exist vectors $v_1, v_2, \ldots, v_k$ in S that are linearly dependent. 
Warning: The set $S = \{v_1, v_2, \ldots, v_m\}$ above represents a list or, in other words, a finite sequence 
of vectors where the vectors are ordered and repetition is permitted. 
The following remarks follow directly from the above definition. 

Remark 1: Suppose 0 is one of the vectors $v_1, v_2, \ldots, v_m$, say $v_1 = 0$. Then the vectors must be 
linearly dependent, because we have the following linear combination where the coefficient of $v_1$ is not 0: 

$$1v_1 + 0v_2 + \cdots + 0v_m = 1 \cdot 0 + 0 + \cdots + 0 = 0$$

Remark 2: Suppose v is a nonzero vector. Then v, by itself, is linearly independent, because 

$$kv = 0,\ v \neq 0 \quad\text{implies}\quad k = 0$$

Remark 3: Suppose two of the vectors $v_1, v_2, \ldots, v_m$ are equal or one is a scalar multiple of the 
other, say $v_1 = kv_2$. Then the vectors must be linearly dependent, because we have the following linear 
combination where the coefficient of $v_1$ is not 0: 

$$v_1 - kv_2 + 0v_3 + \cdots + 0v_m = 0$$

Remark 4: Two vectors $v_1$ and $v_2$ are linearly dependent if and only if one of them is a multiple of 
the other. 

Remark 5: If the set $\{v_1, \ldots, v_m\}$ is linearly independent, then any rearrangement of the vectors 
$\{v_{i_1}, v_{i_2}, \ldots, v_{i_m}\}$ is also linearly independent. 

Remark 6: If a set S of vectors is linearly independent, then any subset of S is linearly 
independent. Alternatively, if S contains a linearly dependent subset, then S is linearly dependent. 
EXAMPLE 4.10 
(a) Let u = (1,1,0), v = (1,3,2), w = (4,9, 5). Then u, v, w are linearly dependent, because 

3u + 5v — 2w = 3(1,1,0) + 5(1,3,2) — 2(4,9,5) = (0,0,0) = 0 


(b) We show that the vectors u = (1,2,3), v = (2,5,7), w = (1,3,5) are linearly independent. We form the vector 
equation xu + yv + zw = 0, where x, y, z are unknown scalars. This yields 

$$x\begin{bmatrix}1\\2\\3\end{bmatrix} + y\begin{bmatrix}2\\5\\7\end{bmatrix} + z\begin{bmatrix}1\\3\\5\end{bmatrix} = \begin{bmatrix}0\\0\\0\end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 2y + z &= 0 \\ 2x + 5y + 3z &= 0 \\ 3x + 7y + 5z &= 0 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y + z &= 0 \\ y + z &= 0 \\ 2z &= 0 \end{aligned}$$

Back-substitution yields x = 0, y = 0, z = 0. We have shown that 

$$xu + yv + zw = 0 \quad\text{implies}\quad x = 0,\ y = 0,\ z = 0$$

Accordingly, u, v, w are linearly independent. 

(c) Let V be the vector space of functions from $\mathbf{R}$ into $\mathbf{R}$. We show that the functions $f(t) = \sin t$, $g(t) = e^t$, 
$h(t) = t^2$ are linearly independent. We form the vector (function) equation xf + yg + zh = 0, where x, y, z are 
unknown scalars. This function equation means that, for every value of t, 

$$x \sin t + ye^t + zt^2 = 0$$

Thus, in this equation, we choose appropriate values of t to easily get x = 0, y = 0, z = 0. For example, 

(i) Substitute t = 0 to obtain x(0) + y(1) + z(0) = 0 or y = 0 
(ii) Substitute $t = \pi$ to obtain $x(0) + 0(e^{\pi}) + z(\pi^2) = 0$ or z = 0 
(iii) Substitute $t = \pi/2$ to obtain $x(1) + 0(e^{\pi/2}) + 0(\pi^2/4) = 0$ or x = 0 

We have shown 

$$xf + yg + zh = 0 \quad\text{implies}\quad x = 0,\ y = 0,\ z = 0$$

Accordingly, f, g, h are linearly independent. 
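
For vectors in $K^n$, the test used in parts (a) and (b) of Example 4.10 can be mechanized: row reduce the matrix whose rows are the given vectors, and the vectors are linearly independent exactly when no zero row appears. The sketch below is a minimal Python illustration; the helper names `rref`, `rank`, and `linearly_independent` are assumptions made for this example.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the row canonical form of `rows` and its pivot column indices."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(M)
    n_cols = len(M[0]) if M else 0
    pivots = []
    r = 0
    for c in range(n_cols):
        p = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]
        for i in range(n_rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return M, pivots

def rank(rows):
    """Number of nonzero rows in an echelon form of the matrix."""
    return len(rref(rows)[1])

def linearly_independent(vectors):
    """True when the only solution of x1*v1 + ... + xm*vm = 0 is the zero solution."""
    return rank(vectors) == len(vectors)

# Example 4.10(a): dependent, with the explicit relation 3u + 5v - 2w = 0.
u, v, w = [1, 1, 0], [1, 3, 2], [4, 9, 5]
relation = [3 * a + 5 * b - 2 * c for a, b, c in zip(u, v, w)]
print(relation, linearly_independent([u, v, w]))

# Example 4.10(b): independent.
print(linearly_independent([[1, 2, 3], [2, 5, 7], [1, 3, 5]]))
```

This approach does not apply directly to part (c), where the vectors are functions rather than n-tuples.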
Linear Dependence in $\mathbf{R}^3$ 

Linear dependence in the vector space $V = \mathbf{R}^3$ can be described geometrically as follows: 

(a) Any two vectors u and v in $\mathbf{R}^3$ are linearly dependent if and only if they lie on the same line through 
the origin O, as shown in Fig. 4-3(a). 

(b) Any three vectors u, v, w in $\mathbf{R}^3$ are linearly dependent if and only if they lie on the same plane 
through the origin O, as shown in Fig. 4-3(b). 

Later, we will be able to show that any four or more vectors in $\mathbf{R}^3$ are automatically linearly dependent. 


[Figure 4-3: (a) u and v are linearly dependent. (b) u, v, and w are linearly dependent.] 


Linear Dependence and Linear Combinations 


The notions of linear dependence and linear combinations are closely related. Specifically, for more than 
one vector, we show that the vectors $v_1, v_2, \ldots, v_m$ are linearly dependent if and only if one of them is a 
linear combination of the others. 
Suppose, say, $v_i$ is a linear combination of the others, 

$$v_i = a_1v_1 + \cdots + a_{i-1}v_{i-1} + a_{i+1}v_{i+1} + \cdots + a_mv_m$$

Then by adding $-v_i$ to both sides, we obtain 

$$a_1v_1 + \cdots + a_{i-1}v_{i-1} - v_i + a_{i+1}v_{i+1} + \cdots + a_mv_m = 0$$

where the coefficient of $v_i$ is not 0. Hence, the vectors are linearly dependent. Conversely, suppose the 
vectors are linearly dependent, say, 

$$b_1v_1 + \cdots + b_jv_j + \cdots + b_mv_m = 0, \quad\text{where } b_j \neq 0$$

Then we can solve for $v_j$, obtaining 

$$v_j = -b_j^{-1}b_1v_1 - \cdots - b_j^{-1}b_{j-1}v_{j-1} - b_j^{-1}b_{j+1}v_{j+1} - \cdots - b_j^{-1}b_mv_m$$

and so $v_j$ is a linear combination of the other vectors. 

We now state a slightly stronger statement than the one above. This result has many important 
consequences. 


LEMMA 4.10: Suppose two or more nonzero vectors $v_1, v_2, \ldots, v_m$ are linearly dependent. Then one 
of the vectors is a linear combination of the preceding vectors; that is, there exists 
k > 1 such that 

$$v_k = c_1v_1 + c_2v_2 + \cdots + c_{k-1}v_{k-1}$$
Linear Dependence and Echelon Matrices 

Consider the following echelon matrix A, whose pivots have been circled: 

[Echelon matrix A: four nonzero rows $R_1, R_2, R_3, R_4$, with the pivots of $R_1$, $R_2$, $R_3$ in the second, third, and fifth columns and the pivot of $R_4$ in a later column.] 

Observe that the rows $R_2, R_3, R_4$ have 0's in the second column below the nonzero pivot in $R_1$, and hence 
any linear combination of $R_2, R_3, R_4$ must have 0 as its second entry. Thus, $R_1$ cannot be a linear 
combination of the rows below it. Similarly, the rows $R_3$ and $R_4$ have 0's in the third column below the 
nonzero pivot in $R_2$, and hence $R_2$ cannot be a linear combination of the rows below it. Finally, $R_3$ cannot 
be a multiple of $R_4$, because $R_4$ has a 0 in the fifth column below the nonzero pivot in $R_3$. Viewing the 
nonzero rows from the bottom up, $R_4, R_3, R_2, R_1$, no row is a linear combination of the preceding rows. 
Thus, the rows are linearly independent by Lemma 4.10. 

The argument used with the above echelon matrix A can be used for the nonzero rows of any echelon 
matrix. Thus, we have the following very useful result. 


THEOREM 4.11: The nonzero rows of a matrix in echelon form are linearly independent. 


4.8 Basis and Dimension 


First we state two equivalent ways to define a basis of a vector space V. (The equivalence is proved in 
Problem 4.28.) 


DEFINITION A: A set $S = \{u_1, u_2, \ldots, u_n\}$ of vectors is a basis of V if it has the following two 
properties: (1) S is linearly independent. (2) S spans V. 

DEFINITION B: A set $S = \{u_1, u_2, \ldots, u_n\}$ of vectors is a basis of V if every $v \in V$ can be written 
uniquely as a linear combination of the basis vectors. 


The following is a fundamental result in linear algebra. 


THEOREM 4.12: Let V be a vector space such that one basis has m elements and another basis has n 
elements. Then m = n. 


A vector space V is said to be of finite dimension n or n-dimensional, written 
dimV =n 


if V has a basis with n elements. Theorem 4.12 tells us that all bases of V have the same number of 
elements, so this definition is well defined. 

The vector space {0} is defined to have dimension 0. 

Suppose a vector space V does not have a finite basis. Then V is said to be of infinite dimension or to 
be infinite-dimensional. 

The above fundamental Theorem 4.12 is a consequence of the following ‘‘replacement lemma’’ 
(proved in Problem 4.35). 


LEMMA 4.13: Suppose $\{v_1, v_2, \ldots, v_n\}$ spans V, and suppose $\{w_1, w_2, \ldots, w_m\}$ is linearly independent. 
Then $m \le n$, and V is spanned by a set of the form 

$$\{w_1, w_2, \ldots, w_m, v_{i_1}, v_{i_2}, \ldots, v_{i_{n-m}}\}$$

Thus, in particular, n + 1 or more vectors in V are linearly dependent. 


Observe in the above lemma that we have replaced m of the vectors in the spanning set of V by the m 
independent vectors and still retained a spanning set. 
Examples of Bases 

This subsection presents important examples of bases of some of the main vector spaces appearing in this 
text. 

(a) Vector space $K^n$: Consider the following n vectors in $K^n$: 

$e_1 = (1,0,0,0,\ldots,0,0)$, $e_2 = (0,1,0,0,\ldots,0,0)$, ..., $e_n = (0,0,0,0,\ldots,0,1)$ 

These vectors are linearly independent. (For example, they form a matrix in echelon form.) 
Furthermore, any vector $u = (a_1, a_2, \ldots, a_n)$ in $K^n$ can be written as a linear combination of the 
above vectors. Specifically, 

$$u = a_1e_1 + a_2e_2 + \cdots + a_ne_n$$

Accordingly, the vectors form a basis of $K^n$ called the usual or standard basis of $K^n$. Thus (as one 
might expect), $K^n$ has dimension n. In particular, any other basis of $K^n$ has n elements. 

(b) Vector space $M = M_{r,s}$ of all $r \times s$ matrices: The following six matrices form a basis of the 
vector space $M_{2,3}$ of all $2 \times 3$ matrices over K: 

$$\begin{bmatrix}1&0&0\\0&0&0\end{bmatrix},\ \begin{bmatrix}0&1&0\\0&0&0\end{bmatrix},\ \begin{bmatrix}0&0&1\\0&0&0\end{bmatrix},\ \begin{bmatrix}0&0&0\\1&0&0\end{bmatrix},\ \begin{bmatrix}0&0&0\\0&1&0\end{bmatrix},\ \begin{bmatrix}0&0&0\\0&0&1\end{bmatrix}$$

More generally, in the vector space $M = M_{r,s}$ of all $r \times s$ matrices, let $E_{ij}$ be the matrix with ij-entry 1 
and 0's elsewhere. Then all such matrices form a basis of $M_{r,s}$ called the usual or standard basis of 
$M_{r,s}$. Accordingly, $\dim M_{r,s} = rs$. 

(c) Vector space $P_n(t)$ of all polynomials of degree $\le n$: The set $S = \{1, t, t^2, t^3, \ldots, t^n\}$ of n + 1 
polynomials is a basis of $P_n(t)$. Specifically, any polynomial f(t) of degree $\le n$ can be expressed as a 
linear combination of these powers of t, and one can show that these polynomials are linearly 
independent. Therefore, $\dim P_n(t) = n + 1$. 

(d) Vector space P(t) of all polynomials: Consider any finite set $S = \{f_1(t), f_2(t), \ldots, f_m(t)\}$ of 
polynomials in P(t), and let m denote the largest of the degrees of the polynomials. Then any 
polynomial g(t) of degree exceeding m cannot be expressed as a linear combination of the elements of 
S. Thus, S cannot be a basis of P(t). This means that the dimension of P(t) is infinite. We note that the 
infinite set $S' = \{1, t, t^2, t^3, \ldots\}$, consisting of all the powers of t, spans P(t) and is linearly 
independent. Accordingly, S' is an infinite basis of P(t). 


Theorems on Bases 
The following three theorems (proved in Problems 4.37, 4.38, and 4.39) will be used frequently. 


THEOREM 4.14: Let V be a vector space of finite dimension n. Then: 

(i) Any n + 1 or more vectors in V are linearly dependent. 

(ii) Any linearly independent set $S = \{u_1, u_2, \ldots, u_n\}$ with n elements is a basis 
of V. 

(iii) Any spanning set $T = \{v_1, v_2, \ldots, v_n\}$ of V with n elements is a basis of V. 

THEOREM 4.15: Suppose S spans a vector space V. Then: 

(i) Any maximum number of linearly independent vectors in S form a basis of V. 

(ii) Suppose one deletes from S every vector that is a linear combination of 
preceding vectors in S. Then the remaining vectors form a basis of V. 

THEOREM 4.16: Let V be a vector space of finite dimension and let $S = \{u_1, u_2, \ldots, u_r\}$ be a set of 
linearly independent vectors in V. Then S is part of a basis of V; that is, S may be 
extended to a basis of V. 


EXAMPLE 4.11 
(a) The following four vectors in $\mathbf{R}^4$ form a matrix in echelon form: 

(1,1,1,1), (0,1,1,1), (0,0,1,1), (0,0,0,1) 

Thus, the vectors are linearly independent, and, because $\dim \mathbf{R}^4 = 4$, the four vectors form a basis of $\mathbf{R}^4$. 

(b) The following n + 1 polynomials in $P_n(t)$ are of increasing degree: 

$$1,\ t - 1,\ (t - 1)^2,\ \ldots,\ (t - 1)^n$$

Therefore, no polynomial is a linear combination of preceding polynomials; hence, the polynomials are linearly 
independent. Furthermore, they form a basis of $P_n(t)$, because $\dim P_n(t) = n + 1$. 

(c) Consider any four vectors in $\mathbf{R}^3$, say 

(257, −132, 58), (43, 0, −17), (521, −317, 94), (328, −512, −731) 

By Theorem 4.14(i), the four vectors must be linearly dependent, because they come from the three-dimensional 
vector space $\mathbf{R}^3$. 
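
Theorem 4.14(i) guarantees a dependence among the four vectors in part (c) without exhibiting one. As an illustration only, the Python sketch below finds explicit coefficients by solving the homogeneous system whose coefficient matrix has the four vectors as columns; the helper names `rref` and `dependence` are assumptions made for this example.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the row canonical form of `rows` and its pivot column indices."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(M)
    n_cols = len(M[0]) if M else 0
    pivots = []
    r = 0
    for c in range(n_cols):
        p = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]
        for i in range(n_rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return M, pivots

def dependence(vectors):
    """Return scalars x_j, not all 0, with sum x_j * v_j = 0, or None if none exist."""
    M = [list(row) for row in zip(*vectors)]      # the vectors become the columns of M
    R, pivots = rref(M)
    free = [j for j in range(len(vectors)) if j not in pivots]
    if not free:
        return None                               # the vectors are linearly independent
    x = [Fraction(0)] * len(vectors)
    x[free[0]] = Fraction(1)                      # set one free variable to 1
    for row, p in zip(R, pivots):
        x[p] = -row[free[0]]                      # solve for the pivot variables
    return x

vectors = [[257, -132, 58], [43, 0, -17], [521, -317, 94], [328, -512, -731]]
x = dependence(vectors)
combination = [sum(c * v[i] for c, v in zip(x, vectors)) for i in range(3)]
print(x, combination)
```

Because the matrix has four columns and at most three pivots, at least one column is free, so a nonzero solution always exists for four vectors in $\mathbf{R}^3$.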


Dimension and Subspaces 

The following theorem (proved in Problem 4.40) gives the basic relationship between the dimension of a 

vector space and the dimension of a subspace. 

THEOREM 4.17: Let W be a subspace of an n-dimensional vector space V. Then $\dim W \le n$. In 
particular, if dim W = n, then W = V. 

EXAMPLE 4.12 Let W be a subspace of the real space $\mathbf{R}^3$. Note that $\dim \mathbf{R}^3 = 3$. Theorem 4.17 tells us that the 
dimension of W can only be 0, 1, 2, or 3. The following cases apply: 

(a) If dim W = 0, then W = {0}, a point. 
(b) If dim W = 1, then W is a line through the origin 0. 
(c) If dim W = 2, then W is a plane through the origin 0. 
(d) If dim W = 3, then W is the entire space $\mathbf{R}^3$. 


4.9 Application to Matrices, Rank of a Matrix 


Let A be any $m \times n$ matrix over a field K. Recall that the rows of A may be viewed as vectors in $K^n$ and 
that the row space of A, written rowsp(A), is the subspace of $K^n$ spanned by the rows of A. The following 
definition applies. 


DEFINITION: The rank of a matrix A, written rank(A), is equal to the maximum number of linearly 
independent rows of A or, equivalently, the dimension of the row space of A. 


Recall, on the other hand, that the columns of an $m \times n$ matrix A may be viewed as vectors in $K^m$ and 
that the column space of A, written colsp(A), is the subspace of $K^m$ spanned by the columns of A. 
Although m may not be equal to n (that is, the rows and columns of A may belong to different vector 
spaces), we have the following fundamental result. 


THEOREM 4.18: The maximum number of linearly independent rows of any matrix A is equal to the 
maximum number of linearly independent columns of A. Thus, the dimension of the 
row space of A is equal to the dimension of the column space of A. 


Accordingly, one could restate the above definition of the rank of A using columns instead of rows. 
Basis-Finding Problems 

This subsection shows how an echelon form of any matrix A gives us the solution to certain problems 
about A itself. Specifically, let A and B be the following matrices, where the echelon matrix B (whose 
pivots are boxed) is an echelon form of A: 

$$A = \begin{bmatrix} 1 & 2 & 1 & 3 & 1 & 2 \\ 2 & 5 & 5 & 7 & 4 & 5 \\ 3 & 7 & 6 & 11 & 6 & 9 \\ 1 & 5 & 10 & 8 & 9 & 9 \\ 2 & 6 & 8 & 11 & 9 & 12 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} \boxed{1} & 2 & 1 & 3 & 1 & 2 \\ 0 & \boxed{1} & 3 & 1 & 2 & 1 \\ 0 & 0 & 0 & \boxed{1} & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

We solve the following four problems about the matrix A, where $C_1, C_2, \ldots, C_6$ denote its columns: 
(a) Find a basis of the row space of A. 
(b) Find each column $C_k$ of A that is a linear combination of preceding columns of A. 
(c) Find a basis of the column space of A. 
(d) Find the rank of A. 

(a) We are given that A and B are row equivalent, so they have the same row space. Moreover, B is in 
echelon form, so its nonzero rows are linearly independent and hence form a basis of the row space 
of B. Thus, they also form a basis of the row space of A. That is, 

basis of rowsp(A): (1,2,1,3,1,2), (0,1,3,1,2,1), (0,0,0,1,1,2) 

(b) Let $M_k = [C_1, C_2, \ldots, C_k]$, the submatrix of A consisting of the first k columns of A. Then $M_{k-1}$ and 
$M_k$ are, respectively, the coefficient matrix and augmented matrix of the vector equation 

$$x_1C_1 + x_2C_2 + \cdots + x_{k-1}C_{k-1} = C_k$$

Theorem 3.9 tells us that the system has a solution, or, equivalently, $C_k$ is a linear combination of 
the preceding columns of A, if and only if $\operatorname{rank}(M_k) = \operatorname{rank}(M_{k-1})$, where $\operatorname{rank}(M_k)$ means the 
number of pivots in an echelon form of $M_k$. Now the first k columns of the echelon matrix B also form 
an echelon form of $M_k$. Accordingly, 

$$\operatorname{rank}(M_2) = \operatorname{rank}(M_3) = 2 \quad\text{and}\quad \operatorname{rank}(M_4) = \operatorname{rank}(M_5) = \operatorname{rank}(M_6) = 3$$

Thus, $C_3$, $C_5$, $C_6$ are each a linear combination of the preceding columns of A. 

(c) The fact that the remaining columns $C_1$, $C_2$, $C_4$ are not linear combinations of their respective 
preceding columns also tells us that they are linearly independent. Thus, they form a basis of the 
column space of A. That is, 

basis of colsp(A): $[1,2,3,1,2]^T$, $[2,5,7,5,6]^T$, $[3,7,11,8,11]^T$ 

Observe that $C_1$, $C_2$, $C_4$ may also be characterized as those columns of A that contain the pivots in 
any echelon form of A. 

(d) Here we see that three possible definitions of the rank of A yield the same value. 

(i) There are three pivots in B, which is an echelon form of A. 

(ii) The three pivots in B correspond to the nonzero rows of B, which form a basis of the row 
space of A. 

(iii) The three pivots in B correspond to the columns $C_1$, $C_2$, $C_4$ of A, which form a basis of the column space 
of A. 

Thus, rank(A) = 3. 
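
Everything in parts (a) through (d) is read off from the pivot positions of the echelon matrix B. The following Python sketch makes that bookkeeping explicit; it takes B as given in the text, and the function names `pivot_columns` and `rank_of_leading_block` are hypothetical names chosen for this illustration.

```python
# Echelon form B of A, entered row by row as given in the text.
B = [[1, 2, 1, 3, 1, 2],
     [0, 1, 3, 1, 2, 1],
     [0, 0, 0, 1, 1, 2],
     [0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0]]

def pivot_columns(E):
    """1-based column index of the leading nonzero entry of each nonzero row of an echelon matrix E."""
    pivots = []
    for row in E:
        for j, x in enumerate(row):
            if x != 0:
                pivots.append(j + 1)
                break
    return pivots

def rank_of_leading_block(E, k):
    """rank(M_k): the number of pivots among the first k columns of E."""
    return sum(1 for p in pivot_columns(E) if p <= k)

# (a) Basis of the row space: the nonzero rows of B.
row_space_basis = [row for row in B if any(row)]

# (b) C_k is a combination of preceding columns when rank(M_k) = rank(M_{k-1}).
n_cols = len(B[0])
dependent_cols = [k for k in range(1, n_cols + 1)
                  if rank_of_leading_block(B, k) == rank_of_leading_block(B, k - 1)]

# (c) The pivot columns index a basis of the column space of A.
pivots = pivot_columns(B)

# (d) The rank is the number of pivots.
print(row_space_basis, dependent_cols, pivots, len(pivots))
```

Note that the column indices found in (c) must be applied to the columns of A, not of B: row operations preserve linear relations among columns but change the column space itself.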
Application to Finding a Basis for $W = \operatorname{span}(u_1, u_2, \ldots, u_r)$ 

Frequently, we are given a list $S = \{u_1, u_2, \ldots, u_r\}$ of vectors in $K^n$ and we want to find a basis for the 
subspace W of $K^n$ spanned by the given vectors, that is, a basis of 

$$W = \operatorname{span}(S) = \operatorname{span}(u_1, u_2, \ldots, u_r)$$

The following two algorithms, which are essentially described in the above subsection, find such a basis 
(and hence the dimension) of W. 
Algorithm 4.1 (Row space algorithm) 
Step 1. Form the matrix M whose rows are the given vectors. 
Step 2. Row reduce M to echelon form. 
Step 3. Output the nonzero rows of the echelon matrix. 

Sometimes we want to find a basis that only comes from the original given vectors. The next algorithm 
accomplishes this task. 
Algorithm 4.2 (Casting-out algorithm) 
Step 1. Form the matrix M whose columns are the given vectors. 
Step 2. Row reduce M to echelon form. 


Step 3. For each column $C_k$ in the echelon matrix without a pivot, delete (cast out) the vector $u_k$ from 
the list S of given vectors. 


Step 4. Output the remaining vectors in S (which correspond to columns with pivots). 
We emphasize that in the first algorithm we form a matrix whose rows are the given vectors, whereas 
in the second algorithm we form a matrix whose columns are the given vectors. 
EXAMPLE 4.13 Let W be the subspace of $\mathbf{R}^5$ spanned by the following vectors: 

$u_1 = (1,2,1,3,2)$, $u_2 = (1,3,3,5,3)$, $u_3 = (3,8,7,13,8)$ 
$u_4 = (1,4,6,9,7)$, $u_5 = (5,13,13,25,19)$ 

Find a basis of W consisting of the original given vectors, and find dim W. 
Form the matrix M whose columns are the given vectors, and reduce M to echelon form: 

$$M = \begin{bmatrix} 1 & 1 & 3 & 1 & 5 \\ 2 & 3 & 8 & 4 & 13 \\ 1 & 3 & 7 & 6 & 13 \\ 3 & 5 & 13 & 9 & 25 \\ 2 & 3 & 8 & 7 & 19 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 3 & 1 & 5 \\ 0 & 1 & 2 & 2 & 3 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

The pivots in the echelon matrix appear in columns $C_1$, $C_2$, $C_4$. Accordingly, we "cast out" the vectors $u_3$ and $u_5$ 
from the original five vectors. The remaining vectors $u_1$, $u_2$, $u_4$, which correspond to the columns in the echelon 
matrix with pivots, form a basis of W. Thus, in particular, dim W = 3. 

Remark: The justification of the casting-out algorithm is essentially described above, but we repeat 
it here for emphasis. The fact that column $C_3$ in the echelon matrix in Example 4.13 does not have a 
pivot means that the vector equation 

$$xu_1 + yu_2 = u_3$$

has a solution, and hence $u_3$ is a linear combination of $u_1$ and $u_2$. Similarly, the fact that $C_5$ does not have 
a pivot means that $u_5$ is a linear combination of the preceding vectors. We have deleted each vector in the 
original spanning set that is a linear combination of preceding vectors. Thus, the remaining vectors are 
linearly independent and form a basis of W. 
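
Algorithms 4.1 and 4.2 can be sketched in Python as follows. This is an illustration rather than a definitive implementation: the function names `rref`, `row_space_basis`, and `casting_out` are assumptions, and the sketch reduces all the way to row canonical form, which is one particular echelon form and has the same pivot columns as any other.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the row canonical form of `rows` and its pivot column indices."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(M)
    n_cols = len(M[0]) if M else 0
    pivots = []
    r = 0
    for c in range(n_cols):
        p = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]
        for i in range(n_rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return M, pivots

def row_space_basis(vectors):
    """Algorithm 4.1: the nonzero rows of the reduced matrix whose rows are the vectors."""
    R, pivots = rref(vectors)
    return R[:len(pivots)]

def casting_out(vectors):
    """Algorithm 4.2: keep the given vectors whose columns contain pivots."""
    M = [list(row) for row in zip(*vectors)]   # matrix whose columns are the vectors
    _, pivots = rref(M)
    return [vectors[j] for j in pivots]

# The five vectors of Example 4.13.
u1, u2, u3 = [1, 2, 1, 3, 2], [1, 3, 3, 5, 3], [3, 8, 7, 13, 8]
u4, u5 = [1, 4, 6, 9, 7], [5, 13, 13, 25, 19]
S = [u1, u2, u3, u4, u5]

print(casting_out(S))            # a basis drawn from the original vectors
print(len(row_space_basis(S)))   # dim W
```

The two functions return different bases of the same subspace W: the first generally produces new vectors, whereas the second returns a sublist of the input.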
Application to Homogeneous Systems of Linear Equations 

Consider again a homogeneous system AX = 0 of linear equations over K with n unknowns. By 
Theorem 4.4, the solution set W of such a system is a subspace of $K^n$, and hence W has a dimension. 
The following theorem, whose proof is postponed until Chapter 5, holds. 

THEOREM 4.19: The dimension of the solution space W of a homogeneous system AX = 0 is n − r, 
where n is the number of unknowns and r is the rank of the coefficient matrix A. 

In the case where the system AX = 0 is in echelon form, it has precisely n − r free variables, say 
$x_{i_1}, x_{i_2}, \ldots, x_{i_{n-r}}$. Let $v_j$ be the solution obtained by setting $x_{i_j} = 1$ (or any nonzero constant) and the 
remaining free variables equal to 0. We show (Problem 4.50) that the solutions $v_1, v_2, \ldots, v_{n-r}$ are 
linearly independent; hence, they form a basis of the solution space W. 

We have already used the above process to find a basis of the solution space W of a homogeneous 
system AX = 0 in Section 3.11. Problem 4.48 gives three other examples. 
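
The free-variable construction described above can be sketched in Python. The coefficient matrix `A` below is a hypothetical example chosen for illustration, and the helper names `rref` and `null_space_basis` are likewise assumptions.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the row canonical form of `rows` and its pivot column indices."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(M)
    n_cols = len(M[0]) if M else 0
    pivots = []
    r = 0
    for c in range(n_cols):
        p = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]
        for i in range(n_rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return M, pivots

def null_space_basis(A):
    """One solution per free variable: that variable is set to 1, the other free variables to 0."""
    R, pivots = rref(A)
    n = len(A[0])
    free = [j for j in range(n) if j not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for row, p in zip(R, pivots):
            v[p] = -row[f]          # solve each pivot variable in terms of the free one
        basis.append(v)
    return basis

# Hypothetical system with n = 5 unknowns.
A = [[1, 2, -3, 2, -4],
     [2, 4, -5, 1, -6],
     [5, 10, -13, 4, -16]]
r = len(rref(A)[1])                 # rank of the coefficient matrix
basis = null_space_basis(A)
print(r, len(basis), basis)
```

For this matrix the number of basis vectors returned equals n − r, as Theorem 4.19 states.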


4.10 Sums and Direct Sums 


Let U and W be subsets of a vector space V. The sum of U and W, written U + W, consists of all sums 
u + w where $u \in U$ and $w \in W$. That is, 

$$U + W = \{v : v = u + w, \text{ where } u \in U \text{ and } w \in W\}$$

Now suppose U and W are subspaces of V. Then one can easily show (Problem 4.53) that U + W is a 
subspace of V. Recall that $U \cap W$ is also a subspace of V. The following theorem (proved in Problem 
4.58) relates the dimensions of these subspaces. 

THEOREM 4.20: Suppose U and W are finite-dimensional subspaces of a vector space V. Then 
U + W has finite dimension and 

$$\dim(U + W) = \dim U + \dim W - \dim(U \cap W)$$


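EXAMPLE 4.14 Let $V = M_{2,2}$, the vector space of $2 \times 2$ matrices. Let U consist of those matrices whose second 
row is zero, and let W consist of those matrices whose second column is zero. Then 

$$U = \left\{\begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix}\right\}, \quad W = \left\{\begin{bmatrix} a & 0 \\ c & 0 \end{bmatrix}\right\}, \quad U + W = \left\{\begin{bmatrix} a & b \\ c & 0 \end{bmatrix}\right\}, \quad U \cap W = \left\{\begin{bmatrix} a & 0 \\ 0 & 0 \end{bmatrix}\right\}$$

That is, U + W consists of those matrices whose lower right entry is 0, and $U \cap W$ consists of those matrices 
whose second row and second column are zero. Note that dim U = 2, dim W = 2, $\dim(U \cap W) = 1$. Also, 
dim(U + W) = 3, which is expected from Theorem 4.20. That is, 

$$\dim(U + W) = \dim U + \dim W - \dim(U \cap W) = 2 + 2 - 1 = 3$$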

Direct Sums 

The vector space V is said to be the direct sum of its subspaces U and W, denoted by 

$$V = U \oplus W$$

if every $v \in V$ can be written in one and only one way as v = u + w where $u \in U$ and $w \in W$. 
The following theorem (proved in Problem 4.59) characterizes such a decomposition. 

THEOREM 4.21: The vector space V is the direct sum of its subspaces U and W if and only if: 
(i) V = U + W, (ii) $U \cap W = \{0\}$. 
EXAMPLE 4.15 Consider the vector space $V = \mathbf{R}^3$. 

(a) Let U be the xy-plane and let W be the yz-plane; that is, 

$U = \{(a,b,0) : a,b \in \mathbf{R}\}$ and $W = \{(0,b,c) : b,c \in \mathbf{R}\}$ 

Then $\mathbf{R}^3 = U + W$, because every vector in $\mathbf{R}^3$ is the sum of a vector in U and a vector in W. However, $\mathbf{R}^3$ is not 
the direct sum of U and W, because such sums are not unique. For example, 

(3,5,7) = (3,1,0) + (0,4,7) and also (3,5,7) = (3,−4,0) + (0,9,7) 

(b) Let U be the xy-plane and let W be the z-axis; that is, 

$U = \{(a,b,0) : a,b \in \mathbf{R}\}$ and $W = \{(0,0,c) : c \in \mathbf{R}\}$ 

Now any vector $(a,b,c) \in \mathbf{R}^3$ can be written as the sum of a vector in U and a vector in W in one and only one 
way: 

(a,b,c) = (a,b,0) + (0,0,c) 

Accordingly, $\mathbf{R}^3$ is the direct sum of U and W; that is, $\mathbf{R}^3 = U \oplus W$. 


General Direct Sums 


The notion of a direct sum is extended to more than two subspaces in the obvious way. That is, V is the direct 
sum of subspaces $W_1, W_2, \ldots, W_r$, written 

$$V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$$

if every vector $v \in V$ can be written in one and only one way as 

$$v = w_1 + w_2 + \cdots + w_r$$

where $w_1 \in W_1, w_2 \in W_2, \ldots, w_r \in W_r$. 
The following theorems hold. 

THEOREM 4.22: Suppose $V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$. Also, for each k, suppose $S_k$ is a linearly 
independent subset of $W_k$. Then 
(a) The union $S = \bigcup_k S_k$ is linearly independent in V. 
(b) If each $S_k$ is a basis of $W_k$, then $\bigcup_k S_k$ is a basis of V. 
(c) $\dim V = \dim W_1 + \dim W_2 + \cdots + \dim W_r$. 

THEOREM 4.23: Suppose $V = W_1 + W_2 + \cdots + W_r$ and $\dim V = \sum_k \dim W_k$. Then 
$V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$. 
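
For subspaces of $K^n$ given by spanning lists, Theorems 4.22(c) and 4.23 together give a computable test for a direct sum: the subspaces must jointly span $K^n$, and their dimensions must add up to n. The Python sketch below applies this test to the two cases of Example 4.15; the function names `rref`, `rank`, and `is_direct_sum` are assumptions made for this illustration.

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the row canonical form of `rows` and its pivot column indices."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(M)
    n_cols = len(M[0]) if M else 0
    pivots = []
    r = 0
    for c in range(n_cols):
        p = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]
        for i in range(n_rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return M, pivots

def rank(rows):
    """Dimension of the span of the given vectors."""
    return len(rref(rows)[1])

def is_direct_sum(spanning_lists, n):
    """True if K^n is the direct sum of the subspaces spanned by the given lists."""
    all_vectors = [v for vectors in spanning_lists for v in vectors]
    spans_whole_space = rank(all_vectors) == n                      # V = W_1 + ... + W_r
    dims_add_up = sum(rank(vectors) for vectors in spanning_lists) == n
    return spans_whole_space and dims_add_up

xy_plane = [[1, 0, 0], [0, 1, 0]]
yz_plane = [[0, 1, 0], [0, 0, 1]]
z_axis = [[0, 0, 1]]

print(is_direct_sum([xy_plane, yz_plane], 3))   # Example 4.15(a)
print(is_direct_sum([xy_plane, z_axis], 3))     # Example 4.15(b)
```

In case (a) the two planes span $\mathbf{R}^3$ but their dimensions sum to 4, so the sum is not direct; in case (b) both conditions hold.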


4.11 Coordinates 


Let V be an n-dimensional vector space over K with basis $S = \{u_1, u_2, \ldots, u_n\}$. Then any vector $v \in V$ 
can be expressed uniquely as a linear combination of the basis vectors in S, say 

$$v = a_1u_1 + a_2u_2 + \cdots + a_nu_n$$

These n scalars $a_1, a_2, \ldots, a_n$ are called the coordinates of v relative to the basis S, and they form a vector 
$[a_1, a_2, \ldots, a_n]$ in $K^n$ called the coordinate vector of v relative to S. We denote this vector by $[v]_S$, or 
simply [v], when S is understood. Thus, 

$$[v]_S = [a_1, a_2, \ldots, a_n]$$

For notational convenience, brackets [...], rather than parentheses (...), are used to denote the coordinate 
vector. 

Remark: The above n scalars $a_1, a_2, \ldots, a_n$ also form the coordinate column vector 
$[a_1, a_2, \ldots, a_n]^T$ of v relative to S. The choice of the column vector rather than the row vector to 
represent v depends on the context in which it is used. The use of such column vectors will become clear 
later in Chapter 6. 


EXAMPLE 4.16 Consider the vector space $P_2(t)$ of polynomials of degree $\le 2$. The polynomials 

$$p_1 = t + 1, \quad p_2 = t - 1, \quad p_3 = (t - 1)^2 = t^2 - 2t + 1$$

form a basis S of $P_2(t)$. The coordinate vector [v] of $v = 2t^2 - 5t + 9$ relative to S is obtained as follows. 
Set $v = xp_1 + yp_2 + zp_3$ using unknown scalars x, y, z, and simplify: 

$$\begin{aligned} 2t^2 - 5t + 9 &= x(t + 1) + y(t - 1) + z(t^2 - 2t + 1) \\ &= xt + x + yt - y + zt^2 - 2zt + z \\ &= zt^2 + (x + y - 2z)t + (x - y + z) \end{aligned}$$

Then set the coefficients of the same powers of t equal to each other to obtain the system 

$$z = 2, \quad x + y - 2z = -5, \quad x - y + z = 9$$

The solution of the system is x = 3, y = −4, z = 2. Thus, 

$v = 3p_1 - 4p_2 + 2p_3$, and hence, [v] = [3, −4, 2] 
EXAMPLE 4.17 Consider real space $\mathbf{R}^3$. The following vectors form a basis S of $\mathbf{R}^3$: 

$u_1 = (1,-1,0)$, $u_2 = (1,1,0)$, $u_3 = (0,1,1)$ 

The coordinates of v = (5,3,4) relative to the basis S are obtained as follows. 
Set $v = xu_1 + yu_2 + zu_3$; that is, set v as a linear combination of the basis vectors using unknown scalars x, y, z. 
This yields 

$$\begin{bmatrix}5\\3\\4\end{bmatrix} = x\begin{bmatrix}1\\-1\\0\end{bmatrix} + y\begin{bmatrix}1\\1\\0\end{bmatrix} + z\begin{bmatrix}0\\1\\1\end{bmatrix}$$

The equivalent system of linear equations is as follows: 

$$x + y = 5, \quad -x + y + z = 3, \quad z = 4$$

The solution of the system is x = 3, y = 2, z = 4. Thus, 

$v = 3u_1 + 2u_2 + 4u_3$, and so $[v]_S = [3,2,4]$ 
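
Both examples amount to solving one linear system whose coefficient columns are the basis vectors. The Python sketch below does this by row reducing the augmented matrix; the function names `rref` and `coordinates` are assumptions, and for Example 4.16 each polynomial is represented (also an assumption of this sketch) by its coefficient list in the order constant, t, t².

```python
from fractions import Fraction

def rref(rows):
    """Return (R, pivots): the row canonical form of `rows` and its pivot column indices."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(M)
    n_cols = len(M[0]) if M else 0
    pivots = []
    r = 0
    for c in range(n_cols):
        p = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        piv = M[r][c]
        M[r] = [x / piv for x in M[r]]
        for i in range(n_rows):
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return M, pivots

def coordinates(basis, v):
    """Solve x_1*u_1 + ... + x_n*u_n = v and return the coordinate vector [v]_S."""
    n = len(basis)
    # Augmented matrix [u_1 ... u_n | v]; the basis vectors are its first n columns.
    aug = [[basis[j][i] for j in range(n)] + [v[i]] for i in range(len(v))]
    R, pivots = rref(aug)
    if pivots != list(range(n)):
        raise ValueError("the given vectors do not form a basis")
    return [R[i][n] for i in range(n)]

# Example 4.17: basis of R^3 and v = (5, 3, 4).
S = [[1, -1, 0], [1, 1, 0], [0, 1, 1]]
print(coordinates(S, [5, 3, 4]))

# Example 4.16: p1 = t + 1, p2 = t - 1, p3 = t^2 - 2t + 1, v = 2t^2 - 5t + 9.
P = [[1, 1, 0], [-1, 1, 0], [1, -2, 1]]
print(coordinates(P, [9, -5, 2]))
```

Representing polynomials by coefficient lists is itself an instance of taking coordinates, here relative to the basis {1, t, t²} of $P_2(t)$.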
Remark 1: There is a geometrical interpretation of the coordinates of a vector v relative to a basis 
S for the real space $\mathbf{R}^n$, which we illustrate using the basis S of $\mathbf{R}^3$ in Example 4.17. First consider the 
space $\mathbf{R}^3$ with the usual x, y, z axes. Then the basis vectors determine a new coordinate system of $\mathbf{R}^3$, say 
with x', y', z' axes, as shown in Fig. 4-4. That is, 

(1) The x'-axis is in the direction of $u_1$ with unit length $\|u_1\|$. 
(2) The y'-axis is in the direction of $u_2$ with unit length $\|u_2\|$. 
(3) The z'-axis is in the direction of $u_3$ with unit length $\|u_3\|$. 

Then each vector v = (a,b,c) or, equivalently, the point P(a,b,c) in $\mathbf{R}^3$ will have new coordinates with 
respect to the new x', y', z' axes. These new coordinates are precisely $[v]_S$, the coordinates of v with 
respect to the basis S. Thus, as shown in Example 4.17, the coordinates of the point P(5,3,4) with the 
new axes form the vector [3, 2, 4]. 

[Figure 4-4: the point v = (5, 3, 4) = [3, 2, 4] shown relative to the x', y', z' axes] 

Remark 2: Consider the usual basis $E = \{e_1, e_2, \ldots, e_n\}$ of $K^n$ defined by 

$e_1 = (1,0,0,\ldots,0,0)$, $e_2 = (0,1,0,\ldots,0,0)$, ..., $e_n = (0,0,0,\ldots,0,1)$ 

Let $v = (a_1, a_2, \ldots, a_n)$ be any vector in $K^n$. Then one can easily show that 

$v = a_1e_1 + a_2e_2 + \cdots + a_ne_n$, and so $[v]_E = [a_1, a_2, \ldots, a_n]$ 

That is, the coordinate vector $[v]_E$ of any vector v relative to the usual basis E of $K^n$ is identical to the 
original vector v. 


Isomorphism of V and K” 


Let V be a vector space of dimension n over K, and suppose S = {u_1, u_2, ..., u_n} is a basis of V. Then
each vector v ∈ V corresponds to a unique n-tuple [v]_S in K^n. On the other hand, each n-tuple
[c_1, c_2, ..., c_n] in K^n corresponds to a unique vector c_1u_1 + c_2u_2 + ... + c_nu_n in V. Thus, the basis S
induces a one-to-one correspondence between V and K^n. Furthermore, suppose

v = a_1u_1 + a_2u_2 + ... + a_nu_n and w = b_1u_1 + b_2u_2 + ... + b_nu_n

Then

v + w = (a_1 + b_1)u_1 + (a_2 + b_2)u_2 + ... + (a_n + b_n)u_n
kv = (ka_1)u_1 + (ka_2)u_2 + ... + (ka_n)u_n

where k is a scalar. Accordingly,

[v + w]_S = [a_1 + b_1, ..., a_n + b_n] = [a_1, ..., a_n] + [b_1, ..., b_n] = [v]_S + [w]_S
[kv]_S = [ka_1, ka_2, ..., ka_n] = k[a_1, a_2, ..., a_n] = k[v]_S

Thus, the above one-to-one correspondence between V and K^n preserves the vector space operations of
vector addition and scalar multiplication. We then say that V and K^n are isomorphic, written

V ≅ K^n


We state this result formally. 
THEOREM 4.24: Let V be an n-dimensional vector space over a field K. Then V and K^n are
isomorphic.
The next example gives a practical application of the above result. 


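EXAMPLE 4.18 Suppose we want to determine whether or not the following matrices in V = M_{2,3} are linearly
dependent:

\[
A = \begin{bmatrix} 1 & 2 & -3 \\ 4 & 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 3 & -4 \\ 6 & 5 & 4 \end{bmatrix}, \quad
C = \begin{bmatrix} 3 & 8 & -11 \\ 16 & 10 & 9 \end{bmatrix}
\]

The coordinate vectors of the matrices in the usual basis of M_{2,3} are as follows:
[A] = [1, 2, -3, 4, 0, 1], [B] = [1, 3, -4, 6, 5, 4], [C] = [3, 8, -11, 16, 10, 9]
Form the matrix M whose rows are the above coordinate vectors and reduce M to an echelon form:

\[
M = \begin{bmatrix} 1 & 2 & -3 & 4 & 0 & 1 \\ 1 & 3 & -4 & 6 & 5 & 4 \\ 3 & 8 & -11 & 16 & 10 & 9 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & -3 & 4 & 0 & 1 \\ 0 & 1 & -1 & 2 & 5 & 3 \\ 0 & 2 & -2 & 4 & 10 & 6 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & -3 & 4 & 0 & 1 \\ 0 & 1 & -1 & 2 & 5 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]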


Because the echelon matrix has only two nonzero rows, the coordinate vectors [A], [B], [C] span a subspace of 
dimension 2 and so are linearly dependent. Accordingly, the original matrices A, B, C are linearly dependent. 
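
The reduction in Example 4.18 can also be checked mechanically. The following is a minimal Python sketch and is not part of the original text; the helper names `rank` and `coords` are illustrative assumptions, and exact rational arithmetic from the standard `fractions` module is used so that no rounding occurs.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # index of the next pivot row
    for c in range(len(m[0])):
        # look for a nonzero entry in column c at or below row r
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def coords(matrix):
    """Coordinate vector in the usual basis: the entries read row by row."""
    return [x for row in matrix for x in row]

A = [[1, 2, -3], [4, 0, 1]]
B = [[1, 3, -4], [6, 5, 4]]
C = [[3, 8, -11], [16, 10, 9]]

M = [coords(A), coords(B), coords(C)]
# A rank smaller than the number of matrices means they are linearly dependent.
print(rank(M) < len(M))
```

Working with coordinate vectors is what Theorem 4.24 justifies: dependence among the matrices is the same as dependence among their images in K^6.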


SOLVED PROBLEMS 


Vector Spaces, Linear Combinations 
4.1. Suppose u and v belong to a vector space V. Simplify each of the following expressions: 


(a) E_1 = 3(2u - 4v) + 5u + 7v, (c) E_3 = 2uv + 3(2u + 4v)
(b) E_2 = 3u - 6(3u - 5v) + 7u, (d) E_4 = 5u - 3/v + 5u

Multiply out and collect terms:
(a) E_1 = 6u - 12v + 5u + 7v = 11u - 5v
(b) E_2 = 3u - 18u + 30v + 7u = -8u + 30v
(c) E_3 is not defined because the product uv of vectors is not defined.
(d) E_4 is not defined because division by a vector is not defined.


4.2. Prove Theorem 4.1: Let V be a vector space over a field K. 
(i) k0 = 0. (ii) 0u = 0. (iii) If ku = 0, then k = 0 or u = 0. (iv) (—k)u = k(—u) = —ku. 


(i) By Axiom [A2] with u = 0, we have 0 + 0 = 0. Hence, by Axiom [M1], we have
k0 = k(0 + 0) = k0 + k0
Adding -k0 to both sides gives the desired result.
(ii) For scalars, 0 + 0 = 0. Hence, by Axiom [M3], we have
0u = (0 + 0)u = 0u + 0u
Adding -0u to both sides gives the desired result.
(iii) Suppose ku = 0 and k ≠ 0. Then there exists a scalar k^{-1} such that k^{-1}k = 1. Thus,
u = 1u = (k^{-1}k)u = k^{-1}(ku) = k^{-1}0 = 0

(iv) Using u + (-u) = 0 and k + (-k) = 0 yields
0 = k0 = k[u + (-u)] = ku + k(-u) and 0 = 0u = [k + (-k)]u = ku + (-k)u


Adding —ku to both sides of the first equation gives —ku = k(—u), and adding —ku to both sides of the 
second equation gives —ku = (—k)u. Thus, (—k)u = k(—u) = —ku. 
4.3. Show that (a) k(u - v) = ku - kv, (b) u + u = 2u.
(a) Using the definition of subtraction, that u - v = u + (-v), and Theorem 4.1 (iv), that k(-v) = -kv, we
have

k(u - v) = k[u + (-v)] = ku + k(-v) = ku + (-kv) = ku - kv

(b) Using Axiom [M4] and then Axiom [M3], we have
u + u = 1u + 1u = (1 + 1)u = 2u


4.4. Express v = (1, -2, 5) in R^3 as a linear combination of the vectors
u_1 = (1, 1, 1), u_2 = (1, 2, 3), u_3 = (2, -1, 1)

We seek scalars x, y, z, as yet unknown, such that v = xu_1 + yu_2 + zu_3. Thus, we require

\[
\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix}
= x\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
+ y\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
+ z\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + y + 2z &= 1 \\ x + 2y - z &= -2 \\ x + 3y + z &= 5 \end{aligned}
\]

(For notational convenience, we write the vectors in R^3 as columns, because it is then easier to find the
equivalent system of linear equations.) Reducing the system to echelon form yields the triangular system

x + y + 2z = 1, y - 3z = -3, 5z = 10

The system is consistent and has a solution. Solving by back-substitution yields the solution x = -6, y = 3,
z = 2. Thus, v = -6u_1 + 3u_2 + 2u_3.

Alternatively, write down the augmented matrix M of the equivalent system of linear equations, where
u_1, u_2, u_3 are the first three columns of M and v is the last column, and then reduce M to echelon form:

\[
M = \begin{bmatrix} 1 & 1 & 2 & 1 \\ 1 & 2 & -1 & -2 \\ 1 & 3 & 1 & 5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 & 1 \\ 0 & 1 & -3 & -3 \\ 0 & 2 & -1 & 4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 & 1 \\ 0 & 1 & -3 & -3 \\ 0 & 0 & 5 & 10 \end{bmatrix}
\]

The last matrix corresponds to a triangular system, which has a solution. Solving the triangular system by
back-substitution yields the solution x = -6, y = 3, z = 2. Thus, v = -6u_1 + 3u_2 + 2u_3.


4.5. Express v = (2, -5, 3) in R^3 as a linear combination of the vectors
u_1 = (1, -3, 2), u_2 = (2, -4, -1), u_3 = (1, -5, 7)

We seek scalars x, y, z, as yet unknown, such that v = xu_1 + yu_2 + zu_3. Thus, we require

\[
\begin{bmatrix} 2 \\ -5 \\ 3 \end{bmatrix}
= x\begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix}
+ y\begin{bmatrix} 2 \\ -4 \\ -1 \end{bmatrix}
+ z\begin{bmatrix} 1 \\ -5 \\ 7 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + 2y + z &= 2 \\ -3x - 4y - 5z &= -5 \\ 2x - y + 7z &= 3 \end{aligned}
\]

Reducing the system to echelon form yields the system
x + 2y + z = 2, 2y - 2z = 1, 0 = 3

The system is inconsistent and so has no solution. Thus, v cannot be written as a linear combination of
u_1, u_2, u_3.
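
The two preceding problems follow the same recipe: put the given vectors in the columns of an augmented matrix, reduce, and either read off the coefficients or find an impossible row. The sketch below is an illustration only and is not taken from the text; the function name `express` is an assumption, and free variables (if any) are simply set to 0.

```python
from fractions import Fraction

def express(v, vectors):
    """Return coefficients c with v = sum(c[i] * vectors[i]), or None if the system is inconsistent."""
    n = len(vectors)
    # Augmented matrix: the given vectors are the first n columns, v is the last column.
    m = [[Fraction(u[i]) for u in vectors] + [Fraction(v[i])] for i in range(len(v))]
    pivots = []  # (row, column) of each pivot
    r = 0
    for c in range(n):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append((r, c))
        r += 1
    # A row reading 0 = nonzero means there is no solution.
    for row in m:
        if all(x == 0 for x in row[:n]) and row[n] != 0:
            return None
    coeffs = [Fraction(0)] * n
    for row_index, col in pivots:
        coeffs[col] = m[row_index][n]
    return coeffs

# Data of Problem 4.4 (a solution exists) and Problem 4.5 (no solution).
sol_44 = express((1, -2, 5), [(1, 1, 1), (1, 2, 3), (2, -1, 1)])
sol_45 = express((2, -5, 3), [(1, -3, 2), (2, -4, -1), (1, -5, 7)])
print(sol_44, sol_45)
```

Returning `None` for the inconsistent case mirrors the echelon row 0 = 3 found in Problem 4.5.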


4.6. Express the polynomial v = t^2 + 4t - 3 in P(t) as a linear combination of the polynomials
p_1 = t^2 - 2t + 5, p_2 = 2t^2 - 3t, p_3 = t + 3

Set v as a linear combination of p_1, p_2, p_3 using unknowns x, y, z to obtain

t^2 + 4t - 3 = x(t^2 - 2t + 5) + y(2t^2 - 3t) + z(t + 3) (*)

We can proceed in two ways.

Method 1. Expand the right side of (*) and express it in terms of powers of t as follows:
t^2 + 4t - 3 = xt^2 - 2xt + 5x + 2yt^2 - 3yt + zt + 3z
= (x + 2y)t^2 + (-2x - 3y + z)t + (5x + 3z)

Set coefficients of the same powers of t equal to each other, and reduce the system to echelon form. This
yields

x + 2y = 1, -2x - 3y + z = 4, 5x + 3z = -3
or
x + 2y = 1, y + z = 6, -10y + 3z = -8
or
x + 2y = 1, y + z = 6, 13z = 52

The system is consistent and has a solution. Solving by back-substitution yields the solution x = -3, y = 2,
z = 4. Thus, v = -3p_1 + 2p_2 + 4p_3.

Method 2. The equation (*) is an identity in t; that is, the equation holds for any value of t. Thus, we can
set t equal to any numbers to obtain equations in the unknowns.

(a) Set t = 0 in (*) to obtain the equation -3 = 5x + 3z.
(b) Set t = 1 in (*) to obtain the equation 2 = 4x - y + 4z.
(c) Set t = -1 in (*) to obtain the equation -6 = 8x + 5y + 2z.

Solve the system of the three equations to again obtain the solution x = -3, y = 2, z = 4. Thus,
v = -3p_1 + 2p_2 + 4p_3.
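
As a quick sanity check on the result, Method 2 can be run in reverse: substitute the solution and compare both sides at sample points. The snippet below is only a sketch and is not from the text; it assumes p_3 = t + 3, which is the choice consistent with the system 5x + 3z = -3 in Method 1.

```python
def v(t):
    return t**2 + 4*t - 3

def p1(t):
    return t**2 - 2*t + 5

def p2(t):
    return 2*t**2 - 3*t

def p3(t):
    # Assumed form of p_3, matching the coefficient system in Method 1.
    return t + 3

x, y, z = -3, 2, 4

# Two polynomials of degree at most 2 that agree at three distinct points are identical.
checks = [v(t) == x * p1(t) + y * p2(t) + z * p3(t) for t in (0, 1, -1)]
print(all(checks))
```

Because only integer arithmetic is involved, the comparison is exact.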


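4.7. Express M as a linear combination of the matrices A, B, C, where

\[
M = \begin{bmatrix} 4 & 7 \\ 7 & 9 \end{bmatrix}, \quad
A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & 1 \\ 4 & 5 \end{bmatrix}
\]

Set M as a linear combination of A, B, C using unknown scalars x, y, z; that is, set M = xA + yB + zC.
This yields

\[
\begin{bmatrix} 4 & 7 \\ 7 & 9 \end{bmatrix}
= x\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
+ y\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
+ z\begin{bmatrix} 1 & 1 \\ 4 & 5 \end{bmatrix}
= \begin{bmatrix} x + y + z & x + 2y + z \\ x + 3y + 4z & x + 4y + 5z \end{bmatrix}
\]

Form the equivalent system of equations by setting corresponding entries equal to each other:

x + y + z = 4, x + 2y + z = 7, x + 3y + 4z = 7, x + 4y + 5z = 9
Reducing the system to echelon form yields
x + y + z = 4, y = 3, 3z = -3, 4z = -4

The last equation drops out. Solving the system by back-substitution yields z = -1, y = 3, x = 2. Thus,
M = 2A + 3B - C.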


Subspaces 


4.8. Prove Theorem 4.2: W is a subspace of V if the following two conditions hold:
(a) 0 ∈ W. (b) If u, v ∈ W, then u + v, ku ∈ W.

By (a), W is nonempty, and, by (b), the operations of vector addition and scalar multiplication are well
defined for W. Axioms [A1], [A4], [M1], [M2], [M3], [M4] hold in W because the vectors in W belong to V.
Thus, we need only show that [A2] and [A3] also hold in W. Now [A2] holds because the zero vector in V
belongs to W by (a). Finally, if v ∈ W, then (-1)v = -v ∈ W, and v + (-v) = 0. Thus [A3] holds.


4.9. Let V = R^3. Show that W is not a subspace of V, where
(a) W = {(a, b, c) : a ≥ 0}, (b) W = {(a, b, c) : a^2 + b^2 + c^2 ≤ 1}.

In each case, show that Theorem 4.2 does not hold.

(a) W consists of those vectors whose first entry is nonnegative. Thus, v = (1, 2, 3) belongs to W. Let
k = -3. Then kv = (-3, -6, -9) does not belong to W, because -3 is negative. Thus, W is not a
subspace of V.

(b) W consists of vectors whose length does not exceed 1. Hence, u = (1, 0, 0) and v = (0, 1, 0) belong to
W, but u + v = (1, 1, 0) does not belong to W, because 1^2 + 1^2 + 0^2 = 2 > 1. Thus, W is not a
subspace of V.
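
The two counterexamples amount to testing closure on specific vectors, which is easy to script. The following sketch is illustrative only; the predicate names `in_W_a` and `in_W_b` are assumptions introduced here.

```python
def in_W_a(v):
    """Membership test for part (a): first entry nonnegative."""
    return v[0] >= 0

def in_W_b(v):
    """Membership test for part (b): squared length at most 1."""
    return sum(x * x for x in v) <= 1

# (a) closure under scalar multiplication fails
v = (1, 2, 3)
kv = tuple(-3 * x for x in v)
print(in_W_a(v), in_W_a(kv))

# (b) closure under addition fails
u, w = (1, 0, 0), (0, 1, 0)
s = tuple(p + q for p, q in zip(u, w))
print(in_W_b(u), in_W_b(w), in_W_b(s))
```

A single failing instance is enough to show that a set is not a subspace; passing such spot checks, of course, proves nothing.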


4.10. Let V = P(t), the vector space of real polynomials. Determine whether or not W is a subspace of
V, where

(a) W consists of all polynomials with integral coefficients.
(b) W consists of all polynomials with degree ≤ 6 and the zero polynomial.
(c) W consists of all polynomials with only even powers of t.

(a) No, because scalar multiples of polynomials in W do not always belong to W. For example,
f(t) = 3 + 6t + 7t^2 ∈ W but (1/2)f(t) = 3/2 + 3t + (7/2)t^2 ∉ W

(b and c) Yes. In each case, W contains the zero polynomial, and sums and scalar multiples of polynomials
in W belong to W.


4.11. Let V be the vector space of functions f : R → R. Show that W is a subspace of V, where
(a) W = {f(x) : f(1) = 0}, all functions whose value at 1 is 0.
(b) W = {f(x) : f(3) = f(1)}, all functions assigning the same value to 3 and 1.
(c) W = {f(x) : f(-x) = -f(x)}, all odd functions.

Let 0̂ denote the zero function, so 0̂(x) = 0 for every value of x.
(a) 0̂ ∈ W, because 0̂(1) = 0. Suppose f, g ∈ W. Then f(1) = 0 and g(1) = 0. Also, for scalars a and b, we
have
(af + bg)(1) = af(1) + bg(1) = a0 + b0 = 0
Thus, af + bg ∈ W, and hence W is a subspace.
(b) 0̂ ∈ W, because 0̂(3) = 0 = 0̂(1). Suppose f, g ∈ W. Then f(3) = f(1) and g(3) = g(1). Thus, for any
scalars a and b, we have
(af + bg)(3) = af(3) + bg(3) = af(1) + bg(1) = (af + bg)(1)
Thus, af + bg ∈ W, and hence W is a subspace.
(c) 0̂ ∈ W, because 0̂(-x) = 0 = -0 = -0̂(x). Suppose f, g ∈ W. Then f(-x) = -f(x) and g(-x) = -g(x).
Also, for scalars a and b,
(af + bg)(-x) = af(-x) + bg(-x) = -af(x) - bg(x) = -(af + bg)(x)
Thus, af + bg ∈ W, and hence W is a subspace of V.


4.12. Prove Theorem 4.3: The intersection of any number of subspaces of V is a subspace of V.

Let {W_i : i ∈ I} be a collection of subspaces of V and let W = ∩(W_i : i ∈ I). Because each W_i is a
subspace of V, we have 0 ∈ W_i, for every i ∈ I. Hence, 0 ∈ W. Suppose u, v ∈ W. Then u, v ∈ W_i, for every
i ∈ I. Because each W_i is a subspace, au + bv ∈ W_i, for every i ∈ I. Hence, au + bv ∈ W. Thus, W is a
subspace of V.


Linear Spans 


4.13. Show that the vectors u_1 = (1, 1, 1), u_2 = (1, 2, 3), u_3 = (1, 5, 8) span R^3.

We need to show that an arbitrary vector v = (a, b, c) in R^3 is a linear combination of u_1, u_2, u_3. Set
v = xu_1 + yu_2 + zu_3; that is, set

(a, b, c) = x(1, 1, 1) + y(1, 2, 3) + z(1, 5, 8) = (x + y + z, x + 2y + 5z, x + 3y + 8z)

Form the equivalent system and reduce it to echelon form:

x + y + z = a, x + 2y + 5z = b, x + 3y + 8z = c
or
x + y + z = a, y + 4z = b - a, 2y + 7z = c - a
or
x + y + z = a, y + 4z = b - a, -z = c - 2b + a

The above system is in echelon form and is consistent; in fact,
x = -a + 5b - 3c, y = 3a - 7b + 4c, z = -a + 2b - c

is a solution. Thus, u_1, u_2, u_3 span R^3.
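
The closed-form solution can be spot-checked by substituting it back for a few vectors (a, b, c). This is only a sketch, not part of the text; the function names `coefficients` and `combine` are assumptions, and z is taken as -a + 2b - c, which is what the last echelon equation -z = c - 2b + a gives.

```python
u1, u2, u3 = (1, 1, 1), (1, 2, 3), (1, 5, 8)

def coefficients(a, b, c):
    """Back-substitution solution of x*u1 + y*u2 + z*u3 = (a, b, c)."""
    z = -a + 2 * b - c
    y = 3 * a - 7 * b + 4 * c
    x = -a + 5 * b - 3 * c
    return x, y, z

def combine(x, y, z):
    """The linear combination x*u1 + y*u2 + z*u3."""
    return tuple(x * p + y * q + z * r for p, q, r in zip(u1, u2, u3))

samples = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, -7, 11)]
results = [combine(*coefficients(*v)) == v for v in samples]
print(all(results))
```

Checking the three usual basis vectors already suffices, because both sides depend linearly on (a, b, c).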


4.14. Find conditions on a, b, c so that v = (a, b, c) in R^3 belongs to W = span(u_1, u_2, u_3), where
u_1 = (1, 2, 0), u_2 = (-1, 1, 2), u_3 = (3, 0, -4)

Set v as a linear combination of u_1, u_2, u_3 using unknowns x, y, z; that is, set v = xu_1 + yu_2 + zu_3. This
yields
(a, b, c) = x(1, 2, 0) + y(-1, 1, 2) + z(3, 0, -4) = (x - y + 3z, 2x + y, 2y - 4z)

Form the equivalent system of linear equations and reduce it to echelon form:

x - y + 3z = a, 2x + y = b, 2y - 4z = c
or
x - y + 3z = a, 3y - 6z = b - 2a, 2y - 4z = c
or
x - y + 3z = a, 3y - 6z = b - 2a, 0 = 4a - 2b + 3c

The vector v = (a, b, c) belongs to W if and only if the system is consistent, and it is consistent if and only if
4a - 2b + 3c = 0. Note, in particular, that u_1, u_2, u_3 do not span the whole space R^3.
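
The condition 4a - 2b + 3c = 0 gives a one-line membership test for W. The sketch below, which is illustrative and not from the text (the name `in_W` is an assumption), confirms that the three spanning vectors and one of their combinations pass the test while a vector outside W fails it.

```python
def in_W(v):
    """Membership test derived in Problem 4.14: 4a - 2b + 3c = 0."""
    a, b, c = v
    return 4 * a - 2 * b + 3 * c == 0

u1, u2, u3 = (1, 2, 0), (-1, 1, 2), (3, 0, -4)

# An arbitrary combination 2*u1 - u2 + 5*u3 must also satisfy the condition.
combo = tuple(2 * p - q + 5 * r for p, q, r in zip(u1, u2, u3))

print([in_W(u) for u in (u1, u2, u3)], in_W(combo), in_W((1, 0, 0)))
```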


4.15. Show that the vector space V = P(t) of real polynomials cannot be spanned by a finite number of
polynomials.

Any finite set S of polynomials contains a polynomial of maximum degree, say m. Then the linear span
span(S) of S cannot contain a polynomial of degree greater than m. Thus, span(S) ≠ V, for any finite set S.


4.16. Prove Theorem 4.5: Let S be a subset of V. (i) Then span(S) is a subspace of V containing S.
(ii) If W is a subspace of V containing S, then span(S) ⊆ W.

(i) Suppose S is empty. By definition, span(S) = {0}. Hence span(S) = {0} is a subspace of V and
S ⊆ span(S). Suppose S is not empty and v ∈ S. Then v = 1v ∈ span(S); hence, S ⊆ span(S). Also
0 = 0v ∈ span(S). Now suppose u, w ∈ span(S), say

u = a_1u_1 + ... + a_ru_r = Σ_i a_iu_i and w = b_1w_1 + ... + b_sw_s = Σ_j b_jw_j

where u_i, w_j ∈ S and a_i, b_j ∈ K. Then

u + w = Σ_i a_iu_i + Σ_j b_jw_j and ku = k(Σ_i a_iu_i) = Σ_i ka_iu_i

belong to span(S) because each is a linear combination of vectors in S. Thus, span(S) is a subspace of V.

(ii) Suppose u_1, u_2, ..., u_r ∈ S. Then all the u_i belong to W. Thus, all multiples a_1u_1, a_2u_2, ..., a_ru_r ∈ W,
and so the sum a_1u_1 + a_2u_2 + ... + a_ru_r ∈ W. That is, W contains all linear combinations of elements
in S, or, in other words, span(S) ⊆ W, as claimed.


Linear Dependence 


4.17. Determine whether or not u and v are linearly dependent, where
(a) u = (1, 2), v = (3, -5), (c) u = (1, 2, -3), v = (4, 5, -6)
(b) u = (1, -3), v = (-2, 6), (d) u = (2, 4, -8), v = (3, 6, -12)
Two vectors u and v are linearly dependent if and only if one is a multiple of the other.

(a) No. (b) Yes; for v = -2u. (c) No. (d) Yes, for v = (3/2)u.
4.18. Determine whether or not u and v are linearly dependent, where

(a) u = 2t^2 + 4t - 3, v = 4t^2 + 8t - 6, (b) u = 2t^2 - 3t + 4, v = 4t^2 - 3t + 2,
(c), (d) u and v are pairs of matrices [entries illegible in the source]

Two vectors u and v are linearly dependent if and only if one is a multiple of the other.

(a) Yes; for v = 2u. (b) No. (c) Yes, for v = -4u. (d) No.


4.19. Determine whether or not the vectors u = (1, 1, 2), v = (2, 3, 1), w = (4, 5, 5) in R^3 are linearly
dependent.

Method 1. Set a linear combination of u, v, w equal to the zero vector using unknowns x, y, z to obtain
the equivalent homogeneous system of linear equations and then reduce the system to echelon form.
This yields

\[
x\begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}
+ y\begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}
+ z\begin{bmatrix} 4 \\ 5 \\ 5 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + 2y + 4z &= 0 \\ x + 3y + 5z &= 0 \\ 2x + y + 5z &= 0 \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + 2y + 4z &= 0 \\ y + z &= 0 \end{aligned}
\]

The echelon system has only two nonzero equations in three unknowns; hence, it has a free variable and a
nonzero solution. Thus, u, v, w are linearly dependent.

Method 2. Form the matrix A whose columns are u, v, w and reduce to echelon form:

\[
A = \begin{bmatrix} 1 & 2 & 4 \\ 1 & 3 & 5 \\ 2 & 1 & 5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & 1 \\ 0 & -3 & -3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\]

The third column does not have a pivot; hence, the third vector w is a linear combination of the first two
vectors u and v. Thus, the vectors are linearly dependent. (Observe that the matrix A is also the coefficient
matrix in Method 1. In other words, this method is essentially the same as the first method.)

Method 3. Form the matrix B whose rows are u, v, w, and reduce to echelon form:

\[
B = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 1 \\ 4 & 5 & 5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -3 \\ 0 & 1 & -3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -3 \\ 0 & 0 & 0 \end{bmatrix}
\]

Because the echelon matrix has only two nonzero rows, the three vectors are linearly dependent. (The three
given vectors span a space of dimension 2.)
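
Method 1 does more than detect dependence: the free variable produces an explicit relation among the vectors. The following sketch is an illustration under the stated echelon system (x + 2y + 4z = 0, y + z = 0) and is not part of the text; choosing z = 1 is an arbitrary assumption.

```python
u, v, w = (1, 1, 2), (2, 3, 1), (4, 5, 5)

# Echelon system from Method 1: x + 2y + 4z = 0 and y + z = 0, with z free.
z = 1                  # any nonzero choice of the free variable works
y = -z                 # from y + z = 0
x = -2 * y - 4 * z     # from x + 2y + 4z = 0

# The combination x*u + y*v + z*w should be the zero vector.
relation = tuple(x * a + y * b + z * c for a, b, c in zip(u, v, w))
print((x, y, z), relation)
```

With z = 1 the relation reads -2u - v + w = 0, that is, w = 2u + v, which agrees with the remark in Method 2 that w is a linear combination of u and v.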


4.20. Determine whether or not each of the following lists of vectors in R^3 is linearly dependent:
(a) u_1 = (1, 2, 5), u_2 = (1, 3, 1), u_3 = (2, 5, 7), u_4 = (3, 1, 4),
(b) u = (1, 2, 5), v = (2, 5, 1), w = (1, 5, 2),
(c) u = (1, 2, 3), v = (0, 0, 0), w = (1, 5, 6).

(a) Yes, because any four vectors in R^3 are linearly dependent.

(b) Use Method 2 above; that is, form the matrix A whose columns are the given vectors, and reduce the
matrix to echelon form:

\[
A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 5 & 5 \\ 5 & 1 & 2 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & -9 & -3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 24 \end{bmatrix}
\]

Every column has a pivot entry; hence, no vector is a linear combination of the previous vectors. Thus,
the vectors are linearly independent.

(c) Because 0 = (0, 0, 0) is one of the vectors, the vectors are linearly dependent.
4.21. Show that the functions f(t) = sin t, g(t) = cos t, h(t) = t from R into R are linearly independent.

Set a linear combination of the functions equal to the zero function 0 using unknown scalars x, y, z; that
is, set xf + yg + zh = 0. Then show x = 0, y = 0, z = 0. We emphasize that xf + yg + zh = 0 means that,
for every value of t, we have xf(t) + yg(t) + zh(t) = 0.

Thus, in the equation x sin t + y cos t + zt = 0:

(i) Set t = 0 to obtain x(0) + y(1) + z(0) = 0 or y = 0.
(ii) Set t = π/2 to obtain x(1) + y(0) + zπ/2 = 0 or x + πz/2 = 0.
(iii) Set t = π to obtain x(0) + y(-1) + z(π) = 0 or -y + πz = 0.


The three equations have only the zero solution; that is, x = 0, y= 0, z= 0. Thus, f, g, h are linearly 
independent. 
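
The three sample points give a 3 × 3 homogeneous system, and it has only the zero solution exactly when its coefficient determinant is nonzero. The sketch below checks this numerically; it is an illustration rather than part of the text, the helper `det3` is an assumed name, and floating-point values of sin and cos are used, so the comparison is made with a tolerance.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

points = [0.0, math.pi / 2, math.pi]
# One row per sample point t: the coefficients of x, y, z in x sin t + y cos t + z t = 0.
rows = [[math.sin(t), math.cos(t), t] for t in points]

d = det3(rows)
print(abs(d) > 1e-9)  # a nonzero determinant forces x = y = z = 0
```

By hand, the matrix is [[0, 1, 0], [1, 0, π/2], [0, -1, π]], whose determinant is -π.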


4.22. Suppose the vectors u, v, w are linearly independent. Show that the vectors u + v, u— v, 
u — 2v + w are also linearly independent. 


Suppose x(u + v) + y(u — v) + z(u — 2v + w) = 0. Then 


xu + xv + yu — yv + zu — 2zv + zw = 0 


or 


(x+y+z)u+ (x—y—2z)v+zw=0 
Because u, v, w are linearly independent, the coefficients in the above equation are each 0; hence, 


x+y+z=0, x—y—2z=0, z=0 


The only solution to the above homogeneous system is x = 0, y = 0, z = 0. Thus, u + v, u — v, u — 2v + w 
are linearly independent. 


4.23. Show that the vectors u = (1 +i, 2i) and w= (1, 1 +i) in C? are linearly dependent over the 
complex field C but linearly independent over the real field R. 


Recall that two vectors are linearly dependent (over a field K) if and only if one of them is a multiple of 
the other (by an element in K). Because 


(1 + i)w = (1 + i)(1, 1 + i) = (1 + i, 2i) = u

u and w are linearly dependent over C. On the other hand, u and w are linearly independent over R, as no real
multiple of w can equal u. Specifically, when k is real, the first component of kw = (k, k + ki) must be real,
and it can never equal the first component 1 + i of u, which is complex.
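
Python's built-in complex numbers make the computation easy to replay. This is a small illustrative sketch, not part of the text; the variable names are assumptions.

```python
u = (1 + 1j, 2j)
w = (1, 1 + 1j)

# Over C: the complex scalar k = 1 + i carries w to u.
k = 1 + 1j
kw = tuple(k * c for c in w)
print(kw == u)

# Over R: a scalar k with k*w = u would have to equal u[0] / w[0], which is not real.
ratio = u[0] / w[0]
print(ratio.imag != 0)
```

The same pair of vectors is therefore dependent or independent according to which field supplies the scalars.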


Basis and Dimension 

4.24. Determine whether or not each of the following form a basis of R^3:
(a) (1, 1, 1), (1, 0, 1); (c) (1, 1, 1), (1, 2, 3), (2, -1, 1);
(b) (1, 2, 3), (1, 3, 5), (1, 0, 1), (2, 3, 0); (d) (1, 1, 2), (1, 2, 5), (5, 3, 4).

(a and b) No, because dim R^3 = 3, and so a basis of R^3 must contain exactly three elements.

(c) The three vectors form a basis if and only if they are linearly independent. Thus, form the matrix whose
rows are the given vectors, and row reduce the matrix to echelon form:

\[
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 2 & -1 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & -3 & -1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 5 \end{bmatrix}
\]

The echelon matrix has no zero rows; hence, the three vectors are linearly independent, and so they do
form a basis of R^3.

(d) Form the matrix whose rows are the given vectors, and row reduce the matrix to echelon form:

\[
\begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 5 \\ 5 & 3 & 4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & -2 & -6 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{bmatrix}
\]

The echelon matrix has a zero row; hence, the three vectors are linearly dependent, and so they do not
form a basis of R^3.


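4.25. Determine whether (1, 1, 1, 1), (1, 2, 3, 2), (2, 5, 6, 4), (2, 6, 8, 5) form a basis of R^4. If not, find
the dimension of the subspace they span.

Form the matrix whose rows are the given vectors, and row reduce to echelon form:

\[
B = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 3 & 2 \\ 2 & 5 & 6 & 4 \\ 2 & 6 & 8 & 5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 3 & 4 & 2 \\ 0 & 4 & 6 & 3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & -2 & -1 \\ 0 & 0 & -2 & -1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]

The echelon matrix has a zero row. Hence, the four vectors are linearly dependent and do not form a basis of
R^4. Because the echelon matrix has three nonzero rows, the four vectors span a subspace of dimension 3.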


4.26. Extend {u_1 = (1, 1, 1, 1), u_2 = (2, 2, 3, 4)} to a basis of R^4.
First form the matrix with rows u_1 and u_2, and reduce to echelon form:

\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 2 & 3 & 4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 1 & 2 \end{bmatrix}
\]

Then w_1 = (1, 1, 1, 1) and w_2 = (0, 0, 1, 2) span the same set of vectors as spanned by u_1 and u_2. Let
u_3 = (0, 1, 0, 0) and u_4 = (0, 0, 0, 1). Then w_1, u_3, w_2, u_4 form a matrix in echelon form. Thus, they are
linearly independent, and they form a basis of R^4. Hence, u_1, u_2, u_3, u_4 also form a basis of R^4.


4.27. Consider the complex field C, which contains the real field R, which contains the rational field Q. 
(Thus, C is a vector space over R, and R is a vector space over Q.) 


(a) Show that {1,i} is a basis of C over R; hence, C is a vector space of dimension 2 over R. 


(b) Show that R is a vector space of infinite dimension over Q. 


(a) For any v ∈ C, we have v = a + bi = a(1) + b(i), where a, b ∈ R. Hence, {1, i} spans C over R.
Furthermore, if x(1) + y(i) = 0 or x + yi = 0, where x, y ∈ R, then x = 0 and y = 0. Hence, {1, i} is
linearly independent over R. Thus, {1, i} is a basis for C over R.

(b) It can be shown that π is a transcendental number; that is, π is not a root of any polynomial over Q.
Thus, for any n, the n + 1 real numbers 1, π, π^2, ..., π^n are linearly independent over Q. Hence, R cannot be of
dimension n over Q. Accordingly, R is of infinite dimension over Q.


4.28. Suppose S = {u_1, u_2, ..., u_n} is a subset of V. Show that the following Definitions A and B of a
basis of V are equivalent:

(A) S is linearly independent and spans V.
(B) Every v ∈ V is a unique linear combination of vectors in S.

Suppose (A) holds. Because S spans V, the vector v is a linear combination of the u_i, say

v = a_1u_1 + a_2u_2 + ... + a_nu_n

Suppose we also have

v = b_1u_1 + b_2u_2 + ... + b_nu_n

Subtracting, we get

0 = v - v = (a_1 - b_1)u_1 + (a_2 - b_2)u_2 + ... + (a_n - b_n)u_n

But the u_i are linearly independent. Hence, the coefficients in the above relation are each 0:
a_1 - b_1 = 0, a_2 - b_2 = 0, ..., a_n - b_n = 0
Therefore, a_1 = b_1, a_2 = b_2, ..., a_n = b_n. Hence, the representation of v as a linear combination of the u_i is
unique. Thus, (A) implies (B).
Suppose (B) holds. Then S spans V. Suppose

0 = c_1u_1 + c_2u_2 + ... + c_nu_n
However, we do have

0 = 0u_1 + 0u_2 + ... + 0u_n

By hypothesis, the representation of 0 as a linear combination of the u_i is unique. Hence, each c_i = 0 and the
u_i are linearly independent. Thus, (B) implies (A).


Dimension and Subspaces 


4.29. Find a basis and dimension of the subspace W of R^3 where

(a) W = {(a, b, c) : a + b + c = 0}, (b) W = {(a, b, c) : a = b = c}

(a) Note that W ≠ R^3, because, for example, (1, 2, 3) ∉ W. Thus, dim W < 3. Note that u_1 = (1, 0, -1)
and u_2 = (0, 1, -1) are two independent vectors in W. Thus, dim W = 2, and so u_1 and u_2 form a basis
of W.

(b) The vector u = (1, 1, 1) ∈ W. Any vector w ∈ W has the form w = (k, k, k). Hence, w = ku. Thus, u
spans W and dim W = 1.


4.30. Let W be the subspace of R^4 spanned by the vectors
u_1 = (1, -2, 5, -3), u_2 = (2, 3, 1, -4), u_3 = (3, 8, -3, -5)

(a) Find a basis and dimension of W. (b) Extend the basis of W to a basis of R^4.

(a) Apply Algorithm 4.1, the row space algorithm. Form the matrix whose rows are the given vectors, and
reduce it to echelon form:

\[
A = \begin{bmatrix} 1 & -2 & 5 & -3 \\ 2 & 3 & 1 & -4 \\ 3 & 8 & -3 & -5 \end{bmatrix}
\sim \begin{bmatrix} 1 & -2 & 5 & -3 \\ 0 & 7 & -9 & 2 \\ 0 & 14 & -18 & 4 \end{bmatrix}
\sim \begin{bmatrix} 1 & -2 & 5 & -3 \\ 0 & 7 & -9 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]

The nonzero rows (1, -2, 5, -3) and (0, 7, -9, 2) of the echelon matrix form a basis of the row space
of A and hence of W. Thus, in particular, dim W = 2.

(b) We seek four linearly independent vectors, which include the above two vectors. The four vectors
(1, -2, 5, -3), (0, 7, -9, 2), (0, 0, 1, 0), and (0, 0, 0, 1) are linearly independent (because they form an
echelon matrix), and so they form a basis of R^4, which is an extension of the basis of W.


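4.31. Let W be the subspace of R^5 spanned by u_1 = (1, 2, -1, 3, 4), u_2 = (2, 4, -2, 6, 8),
u_3 = (1, 3, 2, 2, 6), u_4 = (1, 4, 5, 1, 8), u_5 = (2, 7, 3, 3, 9). Find a subset of the vectors that
form a basis of W.

Here we use Algorithm 4.2, the casting-out algorithm. Form the matrix M whose columns (not rows)
are the given vectors, and reduce it to echelon form:

\[
M = \begin{bmatrix} 1 & 2 & 1 & 1 & 2 \\ 2 & 4 & 3 & 4 & 7 \\ -1 & -2 & 2 & 5 & 3 \\ 3 & 6 & 2 & 1 & 3 \\ 4 & 8 & 6 & 8 & 9 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 & 1 & 2 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 3 & 6 & 5 \\ 0 & 0 & -1 & -2 & -3 \\ 0 & 0 & 2 & 4 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 & 1 & 2 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & -4 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]

The pivot positions are in columns C_1, C_3, C_5. Hence, the corresponding vectors u_1, u_3, u_5 form a basis of W,
and dim W = 3.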
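
The casting-out procedure reduces to locating the pivot columns, which the following sketch does with exact arithmetic. It is an illustration only and not the text's Algorithm 4.2 verbatim; the function name `pivot_columns` is an assumption, and column indices are 0-based.

```python
from fractions import Fraction

def pivot_columns(rows):
    """Indices (0-based) of the pivot columns found by forward elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # look for a nonzero entry in column c at or below row r
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return pivots

vectors = [(1, 2, -1, 3, 4), (2, 4, -2, 6, 8), (1, 3, 2, 2, 6),
           (1, 4, 5, 1, 8), (2, 7, 3, 3, 9)]

# zip(*vectors) lists the rows of the matrix whose columns are the given vectors.
M = [list(row) for row in zip(*vectors)]

cols = pivot_columns(M)
basis = [vectors[c] for c in cols]
print(cols, basis)
```

Keeping the original vectors that sit in pivot columns, rather than rows of the echelon matrix, is what guarantees that the basis is a subset of the given vectors.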
4.32. Let V be the vector space of 2 × 2 matrices over K. Let W be the subspace of symmetric matrices.
Show that dim W = 3, by finding a basis of W.

Recall that a matrix A = [a_ij] is symmetric if A^T = A, or, equivalently, each a_ij = a_ji. Thus,

\[ A = \begin{bmatrix} a & b \\ b & d \end{bmatrix} \]

denotes an arbitrary 2 × 2 symmetric matrix. Setting (i) a = 1, b = 0, d = 0; (ii) a = 0, b = 1, d = 0;
(iii) a = 0, b = 0, d = 1, we obtain the respective matrices:

\[
E_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad
E_2 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad
E_3 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\]

We claim that S = {E_1, E_2, E_3} is a basis of W; that is, (a) S spans W and (b) S is linearly independent.
(a) The above matrix A = aE_1 + bE_2 + dE_3. Thus, S spans W.

(b) Suppose xE_1 + yE_2 + zE_3 = 0, where x, y, z are unknown scalars. That is, suppose

\[
x\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
+ y\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
+ z\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\quad\text{or}\quad
\begin{bmatrix} x & y \\ y & z \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

Setting corresponding entries equal to each other yields x = 0, y = 0, z = 0. Thus, S is linearly independent.
Therefore, S is a basis of W, as claimed.


Theorems on Linear Dependence, Basis, and Dimension 


4.33. Prove Lemma 4.10: Suppose two or more nonzero vectors v_1, v_2, ..., v_m are linearly dependent.
Then one of them is a linear combination of the preceding vectors.

Because the v_i are linearly dependent, there exist scalars a_1, ..., a_m, not all 0, such that
a_1v_1 + ... + a_mv_m = 0. Let k be the largest integer such that a_k ≠ 0. Then

a_1v_1 + ... + a_kv_k + 0v_{k+1} + ... + 0v_m = 0 or a_1v_1 + ... + a_kv_k = 0

Suppose k = 1; then a_1v_1 = 0, a_1 ≠ 0, and so v_1 = 0. But the v_i are nonzero vectors. Hence, k > 1 and

v_k = -a_k^{-1}a_1v_1 - ... - a_k^{-1}a_{k-1}v_{k-1}

That is, v_k is a linear combination of the preceding vectors.


4.34. Suppose S = {v_1, v_2, ..., v_m} spans a vector space V.
(a) If w ∈ V, then {w, v_1, ..., v_m} is linearly dependent and spans V.
(b) If v_i is a linear combination of v_1, ..., v_{i-1}, then S without v_i spans V.

(a) The vector w is a linear combination of the v_i, because {v_i} spans V. Accordingly, {w, v_1, ..., v_m} is
linearly dependent. Clearly, w with the v_i span V, as the v_i by themselves span V; that is, {w, v_1, ..., v_m}
spans V.

(b) Suppose v_i = k_1v_1 + ... + k_{i-1}v_{i-1}. Let u ∈ V. Because {v_i} spans V, u is a linear combination of the
v_j's, say u = a_1v_1 + ... + a_mv_m. Substituting for v_i, we obtain

u = a_1v_1 + ... + a_{i-1}v_{i-1} + a_i(k_1v_1 + ... + k_{i-1}v_{i-1}) + a_{i+1}v_{i+1} + ... + a_mv_m
= (a_1 + a_ik_1)v_1 + ... + (a_{i-1} + a_ik_{i-1})v_{i-1} + a_{i+1}v_{i+1} + ... + a_mv_m

Thus, {v_1, ..., v_{i-1}, v_{i+1}, ..., v_m} spans V. In other words, we can delete v_i from the spanning set and still
retain a spanning set.


4.35. Prove Lemma 4.13: Suppose {v_1, v_2, ..., v_n} spans V, and suppose {w_1, w_2, ..., w_m} is linearly
independent. Then m ≤ n, and V is spanned by a set of the form

{w_1, w_2, ..., w_m, v_{i_1}, v_{i_2}, ..., v_{i_{n-m}}}

Thus, any n + 1 or more vectors in V are linearly dependent.

It suffices to prove the lemma in the case that the v_i are all not 0. (Prove!) Because {v_i} spans V, we
have by Problem 4.34 that

{w_1, v_1, ..., v_n} (1)

is linearly dependent and also spans V. By Lemma 4.10, one of the vectors in (1) is a linear combination of
the preceding vectors. This vector cannot be w_1, so it must be one of the v's, say v_j. Thus by Problem 4.34,
we can delete v_j from the spanning set (1) and obtain the spanning set

{w_1, v_1, ..., v_{j-1}, v_{j+1}, ..., v_n} (2)

Now we repeat the argument with the vector w_2. That is, because (2) spans V, the set

{w_1, w_2, v_1, ..., v_{j-1}, v_{j+1}, ..., v_n} (3)

is linearly dependent and also spans V. Again by Lemma 4.10, one of the vectors in (3) is a linear
combination of the preceding vectors. We emphasize that this vector cannot be w_1 or w_2 because
{w_1, ..., w_m} is independent; hence, it must be one of the v's, say v_k. Thus, by Problem 4.34, we can
delete v_k from the spanning set (3) and obtain the spanning set

{w_1, w_2, v_1, ..., v_{j-1}, v_{j+1}, ..., v_{k-1}, v_{k+1}, ..., v_n}

We repeat the argument with w_3, and so forth. At each step, we are able to add one of the w's and delete
one of the v's in the spanning set. If m ≤ n, then we finally obtain a spanning set of the required form:

{w_1, ..., w_m, v_{i_1}, ..., v_{i_{n-m}}}

Finally, we show that m > n is not possible. Otherwise, after n of the above steps, we obtain the
spanning set {w_1, ..., w_n}. This implies that w_{n+1} is a linear combination of w_1, ..., w_n, which contradicts
the hypothesis that {w_i} is linearly independent.


4.36. Prove Theorem 4.12: Every basis of a vector space V has the same number of elements.

Suppose {u_1, u_2, ..., u_n} is a basis of V, and suppose {v_1, v_2, ...} is another basis of V. Because {u_i}
spans V, the basis {v_1, v_2, ...} must contain n or fewer vectors, or else it is linearly dependent by
Problem 4.35 (Lemma 4.13). On the other hand, if the basis {v_1, v_2, ...} contains fewer than n elements,
then {u_1, u_2, ..., u_n} is linearly dependent by Problem 4.35. Thus, the basis {v_1, v_2, ...} contains exactly n
vectors, and so the theorem is true.


4.37. Prove Theorem 4.14: Let V be a vector space of finite dimension n. Then

(i) Any n + 1 or more vectors must be linearly dependent.
(ii) Any linearly independent set S = {u_1, u_2, ..., u_n} with n elements is a basis of V.
(iii) Any spanning set T = {v_1, v_2, ..., v_n} of V with n elements is a basis of V.

Suppose B = {w_1, w_2, ..., w_n} is a basis of V.


(i) Because B spans V, any n+ 1 or more vectors are linearly dependent by Lemma 4.13. 


(ii) By Lemma 4.13, elements from B can be adjoined to S to form a spanning set of V with n elements. 
Because S already has n elements, S itself is a spanning set of V. Thus, S is a basis of V. 


(iii) Suppose T is linearly dependent. Then some v; is a linear combination of the preceding vectors. By 
Problem 4.34, V is spanned by the vectors in T without v; and there are n — 1 of them. By Lemma 
4.13, the independent set B cannot have more than n — 1 elements. This contradicts the fact that B has 
n elements. Thus, T is linearly independent, and hence T is a basis of V. 


4.38. Prove Theorem 4.15: Suppose S spans a vector space V. Then

(i) Any maximum number of linearly independent vectors in S form a basis of V.
(ii) Suppose one deletes from S every vector that is a linear combination of preceding vectors in
S. Then the remaining vectors form a basis of V.

(i) Suppose {v_1, ..., v_m} is a maximum linearly independent subset of S, and suppose w ∈ S. Accordingly,
{v_1, ..., v_m, w} is linearly dependent. No v_k can be a linear combination of preceding vectors.
Hence, w is a linear combination of the v_i. Thus, w ∈ span(v_i), and hence S ⊆ span(v_i). This leads to
V = span(S) ⊆ span(v_i) ⊆ V
Thus, {v_i} spans V, and, as it is linearly independent, it is a basis of V.

(ii) The remaining vectors form a maximum linearly independent subset of S; hence, by (i), it is a basis
of V.


4.39. Prove Theorem 4.16: Let V be a vector space of finite dimension and let S = {u_1, u_2, ..., u_r} be a
set of linearly independent vectors in V. Then S is part of a basis of V; that is, S may be extended
to a basis of V.

Suppose B = {w_1, w_2, ..., w_n} is a basis of V. Then B spans V, and hence V is spanned by

S ∪ B = {u_1, u_2, ..., u_r, w_1, w_2, ..., w_n}

By Theorem 4.15, we can delete from S ∪ B each vector that is a linear combination of preceding vectors to
obtain a basis B' for V. Because S is linearly independent, no u_k is a linear combination of preceding vectors.
Thus, B' contains every vector in S, and S is part of the basis B' for V.


4.40. Prove Theorem 4.17: Let W be a subspace of an n-dimensional vector space V. Then dim W ≤ n.
In particular, if dim W = n, then W = V.
Because V is of dimension n, any n + 1 or more vectors are linearly dependent. Furthermore, because a
basis of W consists of linearly independent vectors, it cannot contain more than n elements. Accordingly,
dim W ≤ n.
In particular, if {w_1, ..., w_n} is a basis of W, then, because it is an independent set with n elements, it is
also a basis of V. Thus, W = V when dim W = n.


Rank of a Matrix, Row and Column Spaces 


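4.41. Find the rank and a basis of the row space of each of the following matrices:
\[
\text{(a)}\quad A = \begin{bmatrix} 1 & 2 & 0 & -1 \\ 2 & 6 & -3 & -3 \\ 3 & 10 & -6 & -5 \end{bmatrix},
\qquad
\text{(b)}\quad B = \begin{bmatrix} 1 & 3 & 1 & -2 & -3 \\ 1 & 4 & 3 & -1 & -4 \\ 2 & 3 & -4 & -7 & -3 \\ 3 & 8 & 1 & -7 & -8 \end{bmatrix}
\]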

(a) Row reduce A to echelon form: 
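\[
A \sim \begin{bmatrix} 1 & 2 & 0 & -1 \\ 0 & 2 & -3 & -1 \\ 0 & 4 & -6 & -2 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 0 & -1 \\ 0 & 2 & -3 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]

The two nonzero rows $(1, 2, 0, -1)$ and $(0, 2, -3, -1)$ of the echelon form of A form a basis for
rowsp(A). In particular, rank(A) = 2.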


(b) Row reduce B to echelon form: 


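\[
B \sim \begin{bmatrix} 1 & 3 & 1 & -2 & -3 \\ 0 & 1 & 2 & 1 & -1 \\ 0 & -3 & -6 & -3 & 3 \\ 0 & -1 & -2 & -1 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 3 & 1 & -2 & -3 \\ 0 & 1 & 2 & 1 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]

The two nonzero rows $(1, 3, 1, -2, -3)$ and $(0, 1, 2, 1, -1)$ of the echelon form of B form a basis for
rowsp(B). In particular, rank(B) = 2.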
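
The row reductions in Problem 4.41 can be checked mechanically. The following is a minimal sketch, not part of the original text: the function name `echelon` is hypothetical, exact arithmetic is done with Python's `fractions` module, and the second row of B, which is illegible in this copy, is assumed to be (1, 4, 3, -1, -4) as implied by the first step of the reduction.

```python
from fractions import Fraction

def echelon(rows):
    """Row reduce to an echelon form with exact rational arithmetic; drop zero rows."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0  # index of the next pivot row
    for c in range(len(m[0])):
        # look for a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return m[:r]

A = [[1, 2, 0, -1], [2, 6, -3, -3], [3, 10, -6, -5]]
# Second row of B is an assumption recovered from the reduction shown in the text.
B = [[1, 3, 1, -2, -3], [1, 4, 3, -1, -4], [2, 3, -4, -7, -3], [3, 8, 1, -7, -8]]

rank_A = len(echelon(A))  # number of nonzero rows in the echelon form
rank_B = len(echelon(B))
print(rank_A, rank_B)
```

The nonzero rows returned by `echelon` are a basis of the row space, so their count is the rank.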


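4.42. Show that U = W, where U and W are the following subspaces of $\mathbf{R}^3$:
\[ U = \operatorname{span}(u_1, u_2, u_3) = \operatorname{span}\{(1, 1, -1),\ (2, 3, -1),\ (3, 1, -5)\} \]
\[ W = \operatorname{span}(w_1, w_2, w_3) = \operatorname{span}\{(1, -1, -3),\ (3, -2, -8),\ (2, 1, -3)\} \]

Form the matrix A whose rows are the $u_i$, and row reduce A to row canonical form:
\[
A = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 3 & -1 \\ 3 & 1 & -5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & -1 \\ 0 & 1 & 1 \\ 0 & -2 & -2 \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\]
Next form the matrix B whose rows are the $w_j$, and row reduce B to row canonical form:
\[
B = \begin{bmatrix} 1 & -1 & -3 \\ 3 & -2 & -8 \\ 2 & 1 & -3 \end{bmatrix}
\sim \begin{bmatrix} 1 & -1 & -3 \\ 0 & 1 & 1 \\ 0 & 3 & 3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\]

Because A and B have the same row canonical form, the row spaces of A and B are equal, and so U = W.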
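
The comparison of row canonical forms can be sketched in code. This is an illustration only: the function name `row_canonical` is hypothetical, and zero rows are dropped so that spanning sets of different sizes can be compared.

```python
from fractions import Fraction

def row_canonical(rows):
    """Return the nonzero rows of the row canonical form, computed exactly."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]      # scale the pivot entry to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:       # clear the rest of the pivot column
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return m[:r]

U = [[1, 1, -1], [2, 3, -1], [3, 1, -5]]
W = [[1, -1, -3], [3, -2, -8], [2, 1, -3]]

# Equal row canonical forms mean equal row spaces (Theorem 4.8).
same_row_space = row_canonical(U) == row_canonical(W)
print(same_row_space)
```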


4.43. Let
\[
A = \begin{bmatrix} 1 & 2 & 1 & 2 & 3 & 1 \\ 2 & 4 & 3 & 7 & 7 & 4 \\ 1 & 2 & 2 & 5 & 5 & 6 \\ 3 & 6 & 6 & 15 & 14 & 15 \end{bmatrix}
\]
(a) Find $\operatorname{rank}(M_k)$, for k = 1, 2, ..., 6, where $M_k$ is the submatrix of A consisting of the first k
columns $C_1, C_2, \ldots, C_k$ of A.
(b) Which columns $C_{k+1}$ are linear combinations of preceding columns $C_1, \ldots, C_k$?
(c) Find columns of A that form a basis for the column space of A.
(d) Express column $C_4$ as a linear combination of the columns in part (c).

(a) Row reduce A to echelon form:
\[
A \sim \begin{bmatrix} 1 & 2 & 1 & 2 & 3 & 1 \\ 0 & 0 & 1 & 3 & 1 & 2 \\ 0 & 0 & 1 & 3 & 2 & 5 \\ 0 & 0 & 3 & 9 & 5 & 12 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 & 2 & 3 & 1 \\ 0 & 0 & 1 & 3 & 1 & 2 \\ 0 & 0 & 0 & 0 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
Observe that this simultaneously reduces all the matrices $M_k$ to echelon form; for example, the first four
columns of the echelon form of A are an echelon form of $M_4$. We know that $\operatorname{rank}(M_k)$ is equal to the
number of pivots or, equivalently, the number of nonzero rows in an echelon form of $M_k$. Thus,
\[ \operatorname{rank}(M_1) = \operatorname{rank}(M_2) = 1, \qquad \operatorname{rank}(M_3) = \operatorname{rank}(M_4) = 2, \qquad \operatorname{rank}(M_5) = \operatorname{rank}(M_6) = 3 \]

(b) The vector equation $x_1C_1 + x_2C_2 + \cdots + x_kC_k = C_{k+1}$ yields the system with coefficient matrix $M_k$
and augmented matrix $M_{k+1}$. Thus, $C_{k+1}$ is a linear combination of $C_1, \ldots, C_k$ if and only if
$\operatorname{rank}(M_k) = \operatorname{rank}(M_{k+1})$ or, equivalently, if $C_{k+1}$ does not contain a pivot. Thus, each of $C_2$, $C_4$, $C_6$
is a linear combination of preceding columns.

(c) In the echelon form of A, the pivots are in the first, third, and fifth columns. Thus, columns $C_1$, $C_3$, $C_5$
of A form a basis for the column space of A. Alternatively, deleting columns $C_2$, $C_4$, $C_6$ from the
spanning set of columns (they are linear combinations of other columns), we obtain, again, $C_1$, $C_3$, $C_5$.

(d) The echelon matrix tells us that $C_4$ is a linear combination of columns $C_1$ and $C_3$. The augmented
matrix M of the vector equation $C_4 = xC_1 + yC_3$ consists of the columns $C_1$, $C_3$, $C_4$ of A which, when
reduced to echelon form, yields the matrix (omitting zero rows)
\[
\begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + y &= 2 \\ y &= 3 \end{aligned}
\quad\text{or}\quad x = -1,\ y = 3
\]
Thus, $C_4 = -C_1 + 3C_3 = -C_1 + 3C_3 + 0C_5$.
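
Parts (a) to (d) can be reproduced with a short script. This is a sketch under stated assumptions: the helper name `pivot_columns` is hypothetical, and the third row of A, which is illegible in this copy, is taken to be (1, 2, 2, 5, 5, 6) as implied by the first step of the reduction.

```python
from fractions import Fraction

def pivot_columns(rows):
    """Return the 0-based indices of the pivot columns found by forward elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue                      # no pivot: column depends on earlier ones
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return pivots

A = [[1, 2, 1, 2, 3, 1],
     [2, 4, 3, 7, 7, 4],
     [1, 2, 2, 5, 5, 6],      # assumed row, recovered from the reduction in the text
     [3, 6, 6, 15, 14, 15]]

pivots = pivot_columns(A)
# rank(M_k) is the number of pivot columns among the first k columns
ranks = [sum(1 for c in pivots if c < k) for k in range(1, 7)]

C = list(zip(*A))  # columns of A as tuples
c4_is_combination = all(C[3][i] == -C[0][i] + 3 * C[2][i] for i in range(4))
print([c + 1 for c in pivots], ranks, c4_is_combination)
```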


4.44. Suppose $u = (a_1, a_2, \ldots, a_n)$ is a linear combination of the rows $R_1, R_2, \ldots, R_m$ of a matrix
$B = [b_{ij}]$, say $u = k_1R_1 + k_2R_2 + \cdots + k_mR_m$. Prove that
\[ a_i = k_1b_{1i} + k_2b_{2i} + \cdots + k_mb_{mi}, \qquad i = 1, 2, \ldots, n \]
where $b_{1i}, b_{2i}, \ldots, b_{mi}$ are the entries in the ith column of B.

We are given that $u = k_1R_1 + k_2R_2 + \cdots + k_mR_m$. Hence,
\[
\begin{aligned}
(a_1, \ldots, a_n) &= k_1(b_{11}, \ldots, b_{1n}) + \cdots + k_m(b_{m1}, \ldots, b_{mn}) \\
&= (k_1b_{11} + \cdots + k_mb_{m1},\ \ldots,\ k_1b_{1n} + \cdots + k_mb_{mn})
\end{aligned}
\]
Setting corresponding components equal to each other, we obtain the desired result.

4.45. Prove Theorem 4.7: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are row equivalent echelon matrices with
respective pivot entries
\[ a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r} \quad\text{and}\quad b_{1k_1}, b_{2k_2}, \ldots, b_{sk_s} \]
(pictured in Fig. 4-5). Then A and B have the same number of nonzero rows (that is, r = s) and
their pivot entries are in the same positions; that is, $j_1 = k_1, j_2 = k_2, \ldots, j_r = k_r$.
\[
A = \begin{bmatrix} a_{1j_1} & * & * & * & * & * \\ & & a_{2j_2} & * & * & * \\ & & & \ddots & & \\ & & & & a_{rj_r} & * \end{bmatrix},
\qquad
B = \begin{bmatrix} b_{1k_1} & * & * & * & * & * \\ & & b_{2k_2} & * & * & * \\ & & & \ddots & & \\ & & & & b_{sk_s} & * \end{bmatrix}
\]
Figure 4-5


Clearly A = 0 if and only if B = 0, and so we need only prove the theorem when $r \ge 1$ and $s \ge 1$. We
first show that $j_1 = k_1$. Suppose $j_1 < k_1$. Then the $j_1$th column of B is zero. Because the first row $R^*$ of A is in
the row space of B, we have $R^* = c_1R_1 + c_2R_2 + \cdots + c_mR_m$, where the $R_i$ are the rows of B. Because the
$j_1$th column of B is zero, we have
\[ a_{1j_1} = c_1 0 + c_2 0 + \cdots + c_m 0 = 0 \]
But this contradicts the fact that the pivot entry $a_{1j_1} \ne 0$. Hence, $j_1 \ge k_1$ and, similarly, $k_1 \ge j_1$. Thus $j_1 = k_1$.

Now let $A'$ be the submatrix of A obtained by deleting the first row of A, and let $B'$ be the submatrix of B
obtained by deleting the first row of B. We prove that $A'$ and $B'$ have the same row space. The theorem will
then follow by induction, because $A'$ and $B'$ are also echelon matrices.

Let $R = (a_1, a_2, \ldots, a_n)$ be any row of $A'$ and let $R_1, \ldots, R_m$ be the rows of B. Because R is in the row
space of B, there exist scalars $d_1, \ldots, d_m$ such that $R = d_1R_1 + d_2R_2 + \cdots + d_mR_m$. Because A is in echelon
form and R is not the first row of A, the $j_1$th entry of R is zero: $a_i = 0$ for $i = j_1 = k_1$. Furthermore, because B is
in echelon form, all the entries in the $k_1$th column of B are 0 except the first: $b_{1k_1} \ne 0$, but
$b_{2k_1} = 0, \ldots, b_{mk_1} = 0$. Thus,
\[ 0 = a_{k_1} = d_1b_{1k_1} + d_2 0 + \cdots + d_m 0 = d_1b_{1k_1} \]
Now $b_{1k_1} \ne 0$ and so $d_1 = 0$. Thus, R is a linear combination of $R_2, \ldots, R_m$ and so is in the row space of $B'$.
Because R was any row of $A'$, the row space of $A'$ is contained in the row space of $B'$. Similarly, the row
space of $B'$ is contained in the row space of $A'$. Thus, $A'$ and $B'$ have the same row space, and so the theorem
is proved.


4.46. Prove Theorem 4.8: Suppose A and B are row canonical matrices. Then A and B have the same
row space if and only if they have the same nonzero rows.

Obviously, if A and B have the same nonzero rows, then they have the same row space. Thus we only
have to prove the converse.

Suppose A and B have the same row space, and suppose $R \ne 0$ is the ith row of A. Then there exist
scalars $c_1, \ldots, c_s$ such that
\[ R = c_1R_1 + c_2R_2 + \cdots + c_sR_s \tag{1} \]
where the $R_i$ are the nonzero rows of B. The theorem is proved if we show that $R = R_i$; that is, that $c_i = 1$ but
$c_k = 0$ for $k \ne i$.

Let $a_{ij_i}$ be the pivot entry in R, that is, the first nonzero entry of R. By (1) and Problem 4.44,
\[ a_{ij_i} = c_1b_{1j_i} + c_2b_{2j_i} + \cdots + c_sb_{sj_i} \tag{2} \]
But, by Problem 4.45, $b_{ij_i}$ is a pivot entry of B, and, as B is row reduced, it is the only nonzero entry in the $j_i$th
column of B. Thus, from (2), we obtain $a_{ij_i} = c_ib_{ij_i}$. However, $a_{ij_i} = 1$ and $b_{ij_i} = 1$, because A and B are row
reduced; hence, $c_i = 1$.

Now suppose $k \ne i$, and $b_{kj_k}$ is the pivot entry in $R_k$. By (1) and Problem 4.44,
\[ a_{ij_k} = c_1b_{1j_k} + c_2b_{2j_k} + \cdots + c_sb_{sj_k} \tag{3} \]
Because B is row reduced, $b_{kj_k}$ is the only nonzero entry in the $j_k$th column of B. Hence, by (3), $a_{ij_k} = c_kb_{kj_k}$.
Furthermore, by Problem 4.45, $a_{kj_k}$ is a pivot entry of A, and because A is row reduced, $a_{ij_k} = 0$. Thus,
$c_kb_{kj_k} = 0$, and as $b_{kj_k} = 1$, $c_k = 0$. Accordingly $R = R_i$, and the theorem is proved.


4.47. Prove Corollary 4.9: Every matrix A is row equivalent to a unique matrix in row canonical
form.

Suppose A is row equivalent to matrices $A_1$ and $A_2$, where $A_1$ and $A_2$ are in row canonical form. Then
rowsp(A) = rowsp($A_1$) and rowsp(A) = rowsp($A_2$). Hence, rowsp($A_1$) = rowsp($A_2$). Because $A_1$ and $A_2$ are
in row canonical form, $A_1 = A_2$ by Theorem 4.8. Thus, the corollary is proved.


4.48. Suppose RB and AB are defined, where R is a row vector and A and B are matrices. Prove
(a) RB is a linear combination of the rows of B.
(b) The row space of AB is contained in the row space of B.
(c) The column space of AB is contained in the column space of A.
(d) If C is a column vector and AC is defined, then AC is a linear combination of the columns
of A.
(e) $\operatorname{rank}(AB) \le \operatorname{rank}(B)$ and $\operatorname{rank}(AB) \le \operatorname{rank}(A)$.

(a) Suppose $R = (a_1, a_2, \ldots, a_m)$ and $B = [b_{ij}]$. Let $B_1, \ldots, B_m$ denote the rows of B and $B^1, \ldots, B^n$ its
columns. Then
\[
\begin{aligned}
RB &= (RB^1, RB^2, \ldots, RB^n) \\
&= (a_1b_{11} + a_2b_{21} + \cdots + a_mb_{m1},\ \ldots,\ a_1b_{1n} + a_2b_{2n} + \cdots + a_mb_{mn}) \\
&= a_1(b_{11}, b_{12}, \ldots, b_{1n}) + a_2(b_{21}, b_{22}, \ldots, b_{2n}) + \cdots + a_m(b_{m1}, b_{m2}, \ldots, b_{mn}) \\
&= a_1B_1 + a_2B_2 + \cdots + a_mB_m
\end{aligned}
\]
Thus, RB is a linear combination of the rows of B, as claimed.

(b) The rows of AB are $R_iB$, where $R_i$ is the ith row of A. Thus, by part (a), each row of AB is in the row
space of B. Thus, $\operatorname{rowsp}(AB) \subseteq \operatorname{rowsp}(B)$, as claimed.

(c) Using part (b), we have $\operatorname{colsp}(AB) = \operatorname{rowsp}((AB)^T) = \operatorname{rowsp}(B^TA^T) \subseteq \operatorname{rowsp}(A^T) = \operatorname{colsp}(A)$.

(d) Follows from (c) where C replaces B.

(e) The row space of AB is contained in the row space of B; hence, $\operatorname{rank}(AB) \le \operatorname{rank}(B)$. Furthermore, the
column space of AB is contained in the column space of A; hence, $\operatorname{rank}(AB) \le \operatorname{rank}(A)$.


4.49. Let A be an n-square matrix. Show that A is invertible if and only if rank(A) = n.

Note that the rows of the n-square identity matrix $I_n$ are linearly independent, because $I_n$ is in echelon
form; hence, $\operatorname{rank}(I_n) = n$. Now if A is invertible, then A is row equivalent to $I_n$; hence, rank(A) = n. But if
A is not invertible, then A is row equivalent to a matrix with a zero row; hence, rank(A) < n; that is, A is
invertible if and only if rank(A) = n.

Applications to Linear Equations


4.50. Find the dimension and a basis of the solution space W of each homogeneous system: 


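\[
\text{(a)}\ \begin{aligned} x + 2y + 2z - s + 3t &= 0 \\ x + 2y + 3z + s + t &= 0 \\ 3x + 6y + 8z + s + 5t &= 0 \end{aligned}
\qquad
\text{(b)}\ \begin{aligned} x + 2y + z - 2t &= 0 \\ 2x + 4y + 4z - 3t &= 0 \\ 3x + 6y + 7z - 4t &= 0 \end{aligned}
\qquad
\text{(c)}\ \begin{aligned} x + y + 2z &= 0 \\ 2x + 3y + 3z &= 0 \\ x + 3y + 5z &= 0 \end{aligned}
\]

(a) Reduce the system to echelon form:
\[
\begin{aligned} x + 2y + 2z - s + 3t &= 0 \\ z + 2s - 2t &= 0 \\ 2z + 4s - 4t &= 0 \end{aligned}
\qquad\text{or}\qquad
\begin{aligned} x + 2y + 2z - s + 3t &= 0 \\ z + 2s - 2t &= 0 \end{aligned}
\]
The system in echelon form has two (nonzero) equations in five unknowns. Hence, the system has
5 - 2 = 3 free variables, which are y, s, t. Thus, dim W = 3. We obtain a basis for W:

(1) Set y = 1, s = 0, t = 0 to obtain the solution $v_1 = (-2, 1, 0, 0, 0)$.

(2) Set y = 0, s = 1, t = 0 to obtain the solution $v_2 = (5, 0, -2, 1, 0)$.

(3) Set y = 0, s = 0, t = 1 to obtain the solution $v_3 = (-7, 0, 2, 0, 1)$.

The set $\{v_1, v_2, v_3\}$ is a basis of the solution space W.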
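
The recipe used in part (a), setting one free variable to 1 and the others to 0 and then solving for the pivot variables, can be sketched as follows. The function name `null_space_basis` is hypothetical and exact rational arithmetic is assumed.

```python
from fractions import Fraction

def null_space_basis(rows):
    """Basis of {x : Ax = 0}: one solution per free variable, set to 1 in turn."""
    m = [[Fraction(x) for x in r] for r in rows]
    n = len(m[0])
    pivots, r = [], 0
    for c in range(n):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for row, pc in zip(m, pivots):   # the first len(pivots) rows are the pivot rows
            v[pc] = -row[free]           # pivot variable forced by the chosen free variable
        basis.append(v)
    return basis

# Coefficient matrix of system (a); unknowns ordered x, y, z, s, t.
A = [[1, 2, 2, -1, 3], [1, 2, 3, 1, 1], [3, 6, 8, 1, 5]]
basis = null_space_basis(A)
print(len(basis), [[int(x) for x in v] for v in basis])
```

The number of vectors returned equals the number of free variables, which is the dimension of the solution space.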


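(b) (Here we use the matrix format of our homogeneous system.) Reduce the coefficient matrix A to
echelon form:
\[
A = \begin{bmatrix} 1 & 2 & 1 & -2 \\ 2 & 4 & 4 & -3 \\ 3 & 6 & 7 & -4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 & -2 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 4 & 2 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 & -2 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
This corresponds to the system
\[ \begin{aligned} x + 2y + z - 2t &= 0 \\ 2z + t &= 0 \end{aligned} \]
The free variables are y and t, and dim W = 2.
(i) Set y = 1, t = 0 to obtain the solution $u_1 = (-2, 1, 0, 0)$.
(ii) Set y = 0, t = 2 to obtain the solution $u_2 = (5, 0, -1, 2)$.
Then $\{u_1, u_2\}$ is a basis of W.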
(c) Reduce the coefficient matrix A to echelon form: 


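\[
A = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 3 \\ 1 & 3 & 5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -1 \\ 0 & 2 & 3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -1 \\ 0 & 0 & 5 \end{bmatrix}
\]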


This corresponds to a triangular system with no free variables. Thus, 0 is the only solution; that is, 
W = {0}. Hence, dim W = 0. 


4.51. Find a homogeneous system whose solution set W is spanned by
\[ \{u_1, u_2, u_3\} = \{(1, -2, 0, 3),\ (1, -1, -1, 4),\ (1, 0, -2, 5)\} \]

Let v = (x, y, z, t). Then $v \in W$ if and only if v is a linear combination of the vectors $u_1, u_2, u_3$ that span
W. Thus, form the matrix M whose first columns are $u_1, u_2, u_3$ and whose last column is v, and then row
reduce M to echelon form. This yields
\[
M = \begin{bmatrix} 1 & 1 & 1 & x \\ -2 & -1 & 0 & y \\ 0 & -1 & -2 & z \\ 3 & 4 & 5 & t \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 & x \\ 0 & 1 & 2 & 2x + y \\ 0 & -1 & -2 & z \\ 0 & 1 & 2 & -3x + t \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 & x \\ 0 & 1 & 2 & 2x + y \\ 0 & 0 & 0 & 2x + y + z \\ 0 & 0 & 0 & -5x - y + t \end{bmatrix}
\]
Then v is a linear combination of $u_1, u_2, u_3$ if rank(M) = rank(A), where A is the submatrix without column
v. Thus, set the last two entries in the fourth column on the right equal to zero to obtain the required
homogeneous system:
\[ \begin{aligned} 2x + y + z &= 0 \\ 5x + y - t &= 0 \end{aligned} \]
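
As a quick sanity check (a sketch, not part of the original solution), each spanning vector can be substituted into the two equations. This only confirms that W is contained in the solution set; equality then follows because both spaces have dimension 2.

```python
# Spanning vectors of W, with coordinates ordered (x, y, z, t).
spanning = [(1, -2, 0, 3), (1, -1, -1, 4), (1, 0, -2, 5)]

# Coefficient rows of the equations 2x + y + z = 0 and 5x + y - t = 0.
equations = [(2, 1, 1, 0), (5, 1, 0, -1)]

def dot(a, b):
    """Ordinary dot product of two equal-length tuples."""
    return sum(p * q for p, q in zip(a, b))

# One row per spanning vector, one entry per equation; all entries should vanish.
residuals = [[dot(eq, u) for eq in equations] for u in spanning]
print(residuals)
```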
4.52. Let $x_{i_1}, x_{i_2}, \ldots, x_{i_k}$ be the free variables of a homogeneous system of linear equations with n
unknowns. Let $v_j$ be the solution for which $x_{i_j} = 1$, and all other free variables equal 0. Show that
the solutions $v_1, v_2, \ldots, v_k$ are linearly independent.

Let A be the matrix whose rows are the $v_i$. We interchange column 1 and column $i_1$, then column 2 and
column $i_2$, ..., then column k and column $i_k$, and we obtain the k × n matrix
\[
B = [I, C] = \begin{bmatrix}
1 & 0 & 0 & \cdots & 0 & 0 & c_{1,k+1} & \cdots & c_{1n} \\
0 & 1 & 0 & \cdots & 0 & 0 & c_{2,k+1} & \cdots & c_{2n} \\
\vdots & & & & & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1 & c_{k,k+1} & \cdots & c_{kn}
\end{bmatrix}
\]
The above matrix B is in echelon form, and so its rows are independent; hence, rank(B) = k. Because A and
B are column equivalent, they have the same rank, so rank(A) = k. But A has k rows; hence, these rows (i.e.,
the $v_i$) are linearly independent, as claimed.


Sums, Direct Sums, Intersections 


4.53. Let U and W be subspaces of a vector space V. Show that
(a) U + W is a subspace of V.
(b) U and W are contained in U + W.
(c) U + W is the smallest subspace containing U and W; that is, U + W = span(U, W).
(d) W + W = W.


(a) Because U and W are subspaces, $0 \in U$ and $0 \in W$. Hence, 0 = 0 + 0 belongs to U + W. Now suppose
$v, v' \in U + W$. Then $v = u + w$ and $v' = u' + w'$, where $u, u' \in U$ and $w, w' \in W$. Then
\[ av + bv' = (au + bu') + (aw + bw') \in U + W \]
Thus, U + W is a subspace of V.

(b) Let $u \in U$. Because W is a subspace, $0 \in W$. Hence, u = u + 0 belongs to U + W. Thus, $U \subseteq U + W$.
Similarly, $W \subseteq U + W$.

(c) Because U + W is a subspace of V containing U and W, it must also contain the linear span of U and
W. That is, $\operatorname{span}(U, W) \subseteq U + W$.
On the other hand, if $v \in U + W$, then $v = u + w = 1u + 1w$, where $u \in U$ and $w \in W$. Thus, v is
a linear combination of elements in $U \cup W$, and so $v \in \operatorname{span}(U, W)$. Hence, $U + W \subseteq \operatorname{span}(U, W)$.
The two inclusion relations give the desired result.

(d) Because W is a subspace of V, we have that W is closed under vector addition; hence, $W + W \subseteq W$. By
part (b), $W \subseteq W + W$. Hence, W + W = W.


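4.54. Consider the following subspaces of $\mathbf{R}^5$:
\[ U = \operatorname{span}(u_1, u_2, u_3) = \operatorname{span}\{(1, 3, -2, 2, 3),\ (1, 4, -3, 4, 2),\ (2, 3, -1, -2, 9)\} \]
\[ W = \operatorname{span}(w_1, w_2, w_3) = \operatorname{span}\{(1, 3, 0, 2, 1),\ (1, 5, -6, 6, 3),\ (2, 5, 3, 2, 1)\} \]
Find a basis and the dimension of (a) U + W, (b) $U \cap W$.

(a) U + W is the space spanned by all six vectors. Hence, form the matrix whose rows are the given six
vectors, and then row reduce to echelon form:
\[
\begin{bmatrix} 1 & 3 & -2 & 2 & 3 \\ 1 & 4 & -3 & 4 & 2 \\ 2 & 3 & -1 & -2 & 9 \\ 1 & 3 & 0 & 2 & 1 \\ 1 & 5 & -6 & 6 & 3 \\ 2 & 5 & 3 & 2 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 3 & -2 & 2 & 3 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & -3 & 3 & -6 & 3 \\ 0 & 0 & 2 & 0 & -2 \\ 0 & 2 & -4 & 4 & 0 \\ 0 & -1 & 7 & -2 & -5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 3 & -2 & 2 & 3 \\ 0 & 1 & -1 & 2 & -1 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
The following three nonzero rows of the echelon matrix form a basis of U + W:
\[ (1, 3, -2, 2, 3), \quad (0, 1, -1, 2, -1), \quad (0, 0, 1, 0, -1) \]
Thus, dim(U + W) = 3.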


(b) Let v = (x, y, z, s, t) denote an arbitrary element in $\mathbf{R}^5$. First find, say as in Problem 4.51, homogeneous
systems whose solution sets are U and W, respectively.

Let M be the matrix whose columns are the $u_i$ and v, and reduce M to echelon form:
\[
M = \begin{bmatrix} 1 & 1 & 2 & x \\ 3 & 4 & 3 & y \\ -2 & -3 & -1 & z \\ 2 & 4 & -2 & s \\ 3 & 2 & 9 & t \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 & x \\ 0 & 1 & -3 & -3x + y \\ 0 & 0 & 0 & -x + y + z \\ 0 & 0 & 0 & 4x - 2y + s \\ 0 & 0 & 0 & -6x + y + t \end{bmatrix}
\]
Set the last three entries in the last column equal to zero to obtain the following homogeneous system whose
solution set is U:
\[ -x + y + z = 0, \qquad 4x - 2y + s = 0, \qquad -6x + y + t = 0 \]

Now let $M'$ be the matrix whose columns are the $w_i$ and v, and reduce $M'$ to echelon form:
\[
M' = \begin{bmatrix} 1 & 1 & 2 & x \\ 3 & 5 & 5 & y \\ 0 & -6 & 3 & z \\ 2 & 6 & 2 & s \\ 1 & 3 & 1 & t \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 & x \\ 0 & 2 & -1 & -3x + y \\ 0 & 0 & 0 & -9x + 3y + z \\ 0 & 0 & 0 & 4x - 2y + s \\ 0 & 0 & 0 & 2x - y + t \end{bmatrix}
\]
Again set the last three entries in the last column equal to zero to obtain the following homogeneous system
whose solution set is W:
\[ -9x + 3y + z = 0, \qquad 4x - 2y + s = 0, \qquad 2x - y + t = 0 \]

Combine both of the above systems to obtain a homogeneous system, whose solution space is $U \cap W$, and
reduce the system to echelon form, yielding
\[
\begin{aligned}
-x + y + z &= 0 \\
2y + 4z + s &= 0 \\
8z + 5s + 2t &= 0 \\
s - 2t &= 0
\end{aligned}
\]
There is one free variable, which is t; hence, $\dim(U \cap W) = 1$. Setting t = 2, we obtain the solution
$u = (1, 4, -3, 4, 2)$, which forms our required basis of $U \cap W$.
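
The dimensions found above can be cross-checked without solving any system, using the dimension formula of Theorem 4.20. The sketch below is an alternative route rather than the method of the solution; the function name `rank` is hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank by forward elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

U = [[1, 3, -2, 2, 3], [1, 4, -3, 4, 2], [2, 3, -1, -2, 9]]
W = [[1, 3, 0, 2, 1], [1, 5, -6, 6, 3], [2, 5, 3, 2, 1]]

dim_U, dim_W = rank(U), rank(W)
dim_sum = rank(U + W)              # list concatenation: all six spanning vectors as rows
dim_int = dim_U + dim_W - dim_sum  # dim(U ∩ W) by Theorem 4.20
print(dim_U, dim_W, dim_sum, dim_int)
```

This gives only the dimension of the intersection; finding an actual basis vector still requires solving the combined system as in part (b).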


4.55. Suppose U and W are distinct four-dimensional subspaces of a vector space V, where dim V = 6.
Find the possible dimensions of $U \cap W$.

Because U and W are distinct, U + W properly contains U and W; consequently, dim(U + W) > 4.
But dim(U + W) cannot be greater than 6, as dim V = 6. Hence, we have two possibilities: (a)
dim(U + W) = 5 or (b) dim(U + W) = 6. By Theorem 4.20,
\[ \dim(U \cap W) = \dim U + \dim W - \dim(U + W) = 8 - \dim(U + W) \]
Thus (a) $\dim(U \cap W) = 3$ or (b) $\dim(U \cap W) = 2$.

4.56. Let U and W be the following subspaces of $\mathbf{R}^3$:
\[ U = \{(a, b, c) : a = b = c\} \quad\text{and}\quad W = \{(0, b, c)\} \]
(Note that W is the yz-plane.) Show that $\mathbf{R}^3 = U \oplus W$.

First we show that $U \cap W = \{0\}$. Suppose $v = (a, b, c) \in U \cap W$. Then a = b = c and a = 0. Hence,
a = 0, b = 0, c = 0. Thus, v = 0 = (0, 0, 0).

Next we show that $\mathbf{R}^3 = U + W$. For, if $v = (a, b, c) \in \mathbf{R}^3$, then
\[ v = (a, a, a) + (0,\ b - a,\ c - a) \quad\text{where}\quad (a, a, a) \in U \quad\text{and}\quad (0,\ b - a,\ c - a) \in W \]
Both conditions $U \cap W = \{0\}$ and $U + W = \mathbf{R}^3$ imply that $\mathbf{R}^3 = U \oplus W$.
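
The decomposition used in the proof can be written out for a concrete vector. This is an illustrative sketch only; the function name `decompose` and the sample vector (2, 5, -1) are hypothetical choices.

```python
def decompose(v):
    """Split v in R^3 as u + w with u in U = {(a, a, a)} and w in W = {(0, b, c)}."""
    a, b, c = v
    u = (a, a, a)            # forced: the first coordinate of w must be 0
    w = (0, b - a, c - a)    # whatever remains lies in the yz-plane
    return u, w

u, w = decompose((2, 5, -1))
total = tuple(p + q for p, q in zip(u, w))
print(u, w, total)
```

The first coordinate of v determines u completely, which is another way to see that the decomposition is unique.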


4.57. Suppose that U and W are subspaces of a vector space V and that $S = \{u_i\}$ spans U and $S' = \{w_j\}$
spans W. Show that $S \cup S'$ spans U + W. (Accordingly, by induction, if $S_i$ spans $W_i$, for
i = 1, 2, ..., n, then $S_1 \cup \cdots \cup S_n$ spans $W_1 + \cdots + W_n$.)

Let $v \in U + W$. Then v = u + w, where $u \in U$ and $w \in W$. Because S spans U, u is a linear
combination of the $u_i$, and as $S'$ spans W, w is a linear combination of the $w_j$; say
\[ u = a_1u_{i_1} + a_2u_{i_2} + \cdots + a_ru_{i_r} \quad\text{and}\quad w = b_1w_{j_1} + b_2w_{j_2} + \cdots + b_sw_{j_s} \]
where $a_i, b_j \in K$. Then
\[ v = u + w = a_1u_{i_1} + a_2u_{i_2} + \cdots + a_ru_{i_r} + b_1w_{j_1} + b_2w_{j_2} + \cdots + b_sw_{j_s} \]
Accordingly, $S \cup S' = \{u_i, w_j\}$ spans U + W.


4.58. Prove Theorem 4.20: Suppose U and W are finite-dimensional subspaces of a vector space V. Then
U + W has finite dimension and
\[ \dim(U + W) = \dim U + \dim W - \dim(U \cap W) \]

Observe that $U \cap W$ is a subspace of both U and W. Suppose dim U = m, dim W = n,
$\dim(U \cap W) = r$. Suppose $\{v_1, \ldots, v_r\}$ is a basis of $U \cap W$. By Theorem 4.16, we can extend $\{v_i\}$ to a
basis of U and to a basis of W; say
\[ \{v_1, \ldots, v_r, u_1, \ldots, u_{m-r}\} \quad\text{and}\quad \{v_1, \ldots, v_r, w_1, \ldots, w_{n-r}\} \]
are bases of U and W, respectively. Let
\[ B = \{v_1, \ldots, v_r, u_1, \ldots, u_{m-r}, w_1, \ldots, w_{n-r}\} \]
Note that B has exactly m + n - r elements. Thus, the theorem is proved if we can show that B is a basis
of U + W. Because $\{v_i, u_j\}$ spans U and $\{v_i, w_k\}$ spans W, the union $B = \{v_i, u_j, w_k\}$ spans U + W. Thus, it
suffices to show that B is independent.

Suppose
\[ a_1v_1 + \cdots + a_rv_r + b_1u_1 + \cdots + b_{m-r}u_{m-r} + c_1w_1 + \cdots + c_{n-r}w_{n-r} = 0 \tag{1} \]
where $a_i$, $b_j$, $c_k$ are scalars. Let
\[ v = a_1v_1 + \cdots + a_rv_r + b_1u_1 + \cdots + b_{m-r}u_{m-r} \tag{2} \]
By (1), we also have
\[ v = -c_1w_1 - \cdots - c_{n-r}w_{n-r} \tag{3} \]
Because $\{v_i, u_j\} \subseteq U$, $v \in U$ by (2); and as $\{w_k\} \subseteq W$, $v \in W$ by (3). Accordingly, $v \in U \cap W$. Now $\{v_i\}$ is
a basis of $U \cap W$, and so there exist scalars $d_1, \ldots, d_r$ for which $v = d_1v_1 + \cdots + d_rv_r$. Thus, by (3), we have
\[ d_1v_1 + \cdots + d_rv_r + c_1w_1 + \cdots + c_{n-r}w_{n-r} = 0 \]
But $\{v_i, w_k\}$ is a basis of W, and so is independent. Hence, the above equation forces $c_1 = 0, \ldots, c_{n-r} = 0$.
Substituting this into (1), we obtain
\[ a_1v_1 + \cdots + a_rv_r + b_1u_1 + \cdots + b_{m-r}u_{m-r} = 0 \]
But $\{v_i, u_j\}$ is a basis of U, and so is independent. Hence, the above equation forces
$a_1 = 0, \ldots, a_r = 0$, $b_1 = 0, \ldots, b_{m-r} = 0$.

Because (1) implies that the $a_i$, $b_j$, $c_k$ are all 0, $B = \{v_i, u_j, w_k\}$ is independent, and the theorem is
proved.


4.59. Prove Theorem 4.21: $V = U \oplus W$ if and only if (i) V = U + W, (ii) $U \cap W = \{0\}$.

Suppose $V = U \oplus W$. Then any $v \in V$ can be uniquely written in the form v = u + w, where $u \in U$ and
$w \in W$. Thus, in particular, V = U + W. Now suppose $v \in U \cap W$. Then
\[ (1)\ v = v + 0, \text{ where } v \in U,\ 0 \in W, \qquad (2)\ v = 0 + v, \text{ where } 0 \in U,\ v \in W. \]
Because such a sum for v is unique, v = 0. Thus, $U \cap W = \{0\}$.

On the other hand, suppose V = U + W and $U \cap W = \{0\}$. Let $v \in V$. Because V = U + W, there exist
$u \in U$ and $w \in W$ such that v = u + w. We need to show that such a sum is unique. Suppose also that
$v = u' + w'$, where $u' \in U$ and $w' \in W$. Then
\[ u + w = u' + w', \quad\text{and so}\quad u - u' = w' - w \]
But $u - u' \in U$ and $w' - w \in W$; hence, by $U \cap W = \{0\}$,
\[ u - u' = 0, \quad w' - w = 0, \quad\text{and so}\quad u = u', \quad w = w' \]
Thus, such a sum for $v \in V$ is unique, and $V = U \oplus W$.


4.60. Prove Theorem 4.22 (for two factors): Suppose $V = U \oplus W$. Also, suppose $S = \{u_1, \ldots, u_m\}$ and
$S' = \{w_1, \ldots, w_n\}$ are linearly independent subsets of U and W, respectively. Then
(a) The union $S \cup S'$ is linearly independent in V.
(b) If S and $S'$ are bases of U and W, respectively, then $S \cup S'$ is a basis of V.
(c) dim V = dim U + dim W.

(a) Suppose $a_1u_1 + \cdots + a_mu_m + b_1w_1 + \cdots + b_nw_n = 0$, where $a_i$, $b_j$ are scalars. Then
\[ (a_1u_1 + \cdots + a_mu_m) + (b_1w_1 + \cdots + b_nw_n) = 0 = 0 + 0 \]
where $0,\ a_1u_1 + \cdots + a_mu_m \in U$ and $0,\ b_1w_1 + \cdots + b_nw_n \in W$. Because such a sum for 0 is unique,
this leads to
\[ a_1u_1 + \cdots + a_mu_m = 0 \quad\text{and}\quad b_1w_1 + \cdots + b_nw_n = 0 \]
Because S is linearly independent, each $a_i = 0$, and because $S'$ is linearly independent, each $b_j = 0$.
Thus, $S \cup S'$ is linearly independent.

(b) By part (a), $S \cup S'$ is linearly independent, and, by Problem 4.57, $S \cup S'$ spans V = U + W.
Thus, $S \cup S'$ is a basis of V.

(c) This follows directly from part (b).


Coordinates 
4.61. Relative to the basis $S = \{u_1, u_2\} = \{(1, 1), (2, 3)\}$ of $\mathbf{R}^2$, find the coordinate vector of v, where
(a) v = (4, -3), (b) v = (a, b).

In each case, set
\[ v = xu_1 + yu_2 = x(1, 1) + y(2, 3) = (x + 2y,\ x + 3y) \]
and then solve for x and y.

(a) We have
\[ (4, -3) = (x + 2y,\ x + 3y) \quad\text{or}\quad \begin{aligned} x + 2y &= 4 \\ x + 3y &= -3 \end{aligned} \]
The solution is x = 18, y = -7. Hence, [v] = [18, -7].

(b) We have
\[ (a, b) = (x + 2y,\ x + 3y) \quad\text{or}\quad \begin{aligned} x + 2y &= a \\ x + 3y &= b \end{aligned} \]
The solution is x = 3a - 2b, y = -a + b. Hence, [v] = [3a - 2b, -a + b].

4.62. Find the coordinate vector of v = (a, b, c) in $\mathbf{R}^3$ relative to
(a) the usual basis E = {(1, 0, 0), (0, 1, 0), (0, 0, 1)},
(b) the basis $S = \{u_1, u_2, u_3\} = \{(1, 1, 1), (1, 1, 0), (1, 0, 0)\}$.

(a) Relative to the usual basis E, the coordinates of $[v]_E$ are the same as v. That is, $[v]_E = [a, b, c]$.

(b) Set v as a linear combination of $u_1, u_2, u_3$ using unknown scalars x, y, z. This yields
\[
\begin{bmatrix} a \\ b \\ c \end{bmatrix} = x\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + z\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + y + z &= a \\ x + y &= b \\ x &= c \end{aligned}
\]
Solving the system yields x = c, y = b - c, z = a - b. Thus, $[v]_S = [c,\ b - c,\ a - b]$.
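
Both coordinate computations above amount to solving a square linear system whose coefficient columns are the basis vectors. A minimal sketch follows; the function name `coordinates` is hypothetical, and the sample vector (5, 7, 2) stands in for (a, b, c) as an arbitrary illustrative choice.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Solve x1*b1 + ... + xn*bn = v for a basis b1, ..., bn of R^n (Gauss-Jordan)."""
    n = len(v)
    # augmented matrix: columns are the basis vectors, last column is v
    m = [[Fraction(b[i]) for b in basis] + [Fraction(v[i])] for i in range(n)]
    for c in range(n):
        # a nonzero pivot exists because the basis vectors are independent
        p = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [x / pivot for x in m[c]]
        for i in range(n):
            if i != c:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[c])]
    return [row[-1] for row in m]

coords_2d = coordinates([(1, 1), (2, 3)], (4, -3))
coords_3d = coordinates([(1, 1, 1), (1, 1, 0), (1, 0, 0)], (5, 7, 2))
print(coords_2d, coords_3d)
```

For the three-dimensional basis the result should agree with the pattern x = c, y = b - c, z = a - b.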


4.63. Consider the vector space $P_3(t)$ of polynomials of degree $\le 3$.
(a) Show that $S = \{(t - 1)^3, (t - 1)^2, t - 1, 1\}$ is a basis of $P_3(t)$.
(b) Find the coordinate vector [v] of $v = 3t^3 + 4t^2 + 2t - 5$ relative to S.

(a) The degree of $(t - 1)^k$ is k; writing the polynomials of S in reverse order, we see that no polynomial is
a linear combination of preceding polynomials. Thus, the polynomials are linearly independent, and,
because $\dim P_3(t) = 4$, they form a basis of $P_3(t)$.

(b) Set v as a linear combination of the basis vectors using unknown scalars x, y, z, s. We have
\[
\begin{aligned}
v = 3t^3 + 4t^2 + 2t - 5 &= x(t - 1)^3 + y(t - 1)^2 + z(t - 1) + s(1) \\
&= x(t^3 - 3t^2 + 3t - 1) + y(t^2 - 2t + 1) + z(t - 1) + s(1) \\
&= xt^3 - 3xt^2 + 3xt - x + yt^2 - 2yt + y + zt - z + s \\
&= xt^3 + (-3x + y)t^2 + (3x - 2y + z)t + (-x + y - z + s)
\end{aligned}
\]
Then set coefficients of the same powers of t equal to each other to obtain
\[ x = 3, \qquad -3x + y = 4, \qquad 3x - 2y + z = 2, \qquad -x + y - z + s = -5 \]
Solving the system yields x = 3, y = 13, z = 19, s = 4. Thus, [v] = [3, 13, 19, 4].
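
The same coordinates can be obtained without expanding powers of (t - 1), by repeated synthetic division by (t - 1). This is an alternative technique, not the method of the solution above; the function name `coords_in_shifted_powers` is hypothetical.

```python
def coords_in_shifted_powers(coeffs, a):
    """Coordinates of a polynomial relative to {(t-a)^n, ..., (t-a), 1}.

    coeffs lists the coefficients from the highest power of t down to the constant.
    Each pass is one synthetic division by (t - a); its remainder is the next coordinate.
    """
    c = list(coeffs)
    remainders = []
    while c:
        for i in range(1, len(c)):
            c[i] += a * c[i - 1]
        # the last entry is the remainder; the rest is the quotient for the next pass
        remainders.append(c.pop())
    return remainders[::-1]  # highest power of (t - a) first

coords = coords_in_shifted_powers([3, 4, 2, -5], 1)
print(coords)  # [3, 13, 19, 4]
```

The remainders are the Taylor coefficients of v at t = 1, which is why they match x, y, z, s.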


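4.64. Find the coordinate vector of $A = \begin{bmatrix} 2 & 3 \\ 4 & -7 \end{bmatrix}$ in the real vector space $M = M_{2,2}$ relative to
\[
\text{(a) the basis } S = \left\{ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \right\},
\]
\[
\text{(b) the usual basis } E = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}.
\]

(a) Set A as a linear combination of the basis vectors using unknown scalars x, y, z, t as follows:
\[
A = \begin{bmatrix} 2 & 3 \\ 4 & -7 \end{bmatrix}
= x\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} + y\begin{bmatrix} 1 & -1 \\ 1 & 0 \end{bmatrix} + z\begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix} + t\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} x + y + z + t & x - y - z \\ x + y & x \end{bmatrix}
\]
Set corresponding entries equal to each other to obtain the system
\[ x + y + z + t = 2, \qquad x - y - z = 3, \qquad x + y = 4, \qquad x = -7 \]
Solving the system yields x = -7, y = 11, z = -21, t = 19. Thus, $[A]_S = [-7, 11, -21, 19]$. (Note that
the coordinate vector of A is a vector in $\mathbf{R}^4$, because dim M = 4.)

(b) Expressing A as a linear combination of the basis matrices yields
\[
\begin{bmatrix} 2 & 3 \\ 4 & -7 \end{bmatrix}
= x\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + y\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + z\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + t\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} x & y \\ z & t \end{bmatrix}
\]
Thus, x = 2, y = 3, z = 4, t = -7. Hence, [A] = [2, 3, 4, -7], whose components are the elements of A
written row by row.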




Remark: This result is true in general; that is, if A is any m × n matrix in $M = M_{m,n}$, then the
coordinates of A relative to the usual basis of M are the elements of A written row by row.

4.65. In the space $M = M_{2,3}$, determine whether or not the following matrices are linearly dependent:
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 0 & 5 \end{bmatrix}, \qquad
B = \begin{bmatrix} 2 & 4 & 7 \\ 10 & 1 & 13 \end{bmatrix}, \qquad
C = \begin{bmatrix} 1 & 2 & 5 \\ 8 & 2 & 11 \end{bmatrix}
\]
If the matrices are linearly dependent, find the dimension and a basis of the subspace W of M
spanned by the matrices.

The coordinate vectors of the above matrices relative to the usual basis of M are as follows:
\[ [A] = [1, 2, 3, 4, 0, 5], \qquad [B] = [2, 4, 7, 10, 1, 13], \qquad [C] = [1, 2, 5, 8, 2, 11] \]
Form the matrix M whose rows are the above coordinate vectors, and reduce M to echelon form:
\[
M = \begin{bmatrix} 1 & 2 & 3 & 4 & 0 & 5 \\ 2 & 4 & 7 & 10 & 1 & 13 \\ 1 & 2 & 5 & 8 & 2 & 11 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 3 & 4 & 0 & 5 \\ 0 & 0 & 1 & 2 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
\]
Because the echelon matrix has only two nonzero rows, the coordinate vectors [A], [B], [C] span a space of
dimension two, and so they are linearly dependent. Thus, A, B, C are linearly dependent. Furthermore,
dim W = 2, and the matrices
\[
w_1 = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 0 & 5 \end{bmatrix} \quad\text{and}\quad w_2 = \begin{bmatrix} 0 & 0 & 1 \\ 2 & 1 & 3 \end{bmatrix}
\]
corresponding to the nonzero rows of the echelon matrix form a basis of W.


Miscellaneous Problems 


4.66. Consider a finite sequence of vectors $S = \{v_1, v_2, \ldots, v_n\}$. Let T be the sequence of vectors
obtained from S by one of the following "elementary operations": (i) interchange two vectors,
(ii) multiply a vector by a nonzero scalar, (iii) add a multiple of one vector to another. Show that S
and T span the same space W. Also show that T is independent if and only if S is independent.


Observe that, for each operation, the vectors in T are linear combinations of vectors in S. On the other 
hand, each operation has an inverse of the same type (Prove!); hence, the vectors in S are linear combinations 
of vectors in T. Thus S and T span the same space W. Also, T is independent if and only if dim W = n, and this 
is true if and only if S is also independent. 


4.67. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be row equivalent m × n matrices over a field K, and let $v_1, \ldots, v_n$ be
any vectors in a vector space V over K. Let
\[
\begin{aligned}
u_1 &= a_{11}v_1 + a_{12}v_2 + \cdots + a_{1n}v_n & \qquad w_1 &= b_{11}v_1 + b_{12}v_2 + \cdots + b_{1n}v_n \\
u_2 &= a_{21}v_1 + a_{22}v_2 + \cdots + a_{2n}v_n & \qquad w_2 &= b_{21}v_1 + b_{22}v_2 + \cdots + b_{2n}v_n \\
&\;\;\vdots & &\;\;\vdots \\
u_m &= a_{m1}v_1 + a_{m2}v_2 + \cdots + a_{mn}v_n & \qquad w_m &= b_{m1}v_1 + b_{m2}v_2 + \cdots + b_{mn}v_n
\end{aligned}
\]
Show that $\{u_i\}$ and $\{w_i\}$ span the same space.

Applying an "elementary operation" of Problem 4.66 to $\{u_i\}$ is equivalent to applying an elementary
row operation to the matrix A. Because A and B are row equivalent, B can be obtained from A by a sequence
of elementary row operations; hence, $\{w_i\}$ can be obtained from $\{u_i\}$ by the corresponding sequence of
operations. Accordingly, $\{u_i\}$ and $\{w_i\}$ span the same space.

4.68. Let $v_1, \ldots, v_n$ belong to a vector space V over K, and let $P = [a_{ij}]$ be an n-square matrix over K. Let
\[ w_1 = a_{11}v_1 + a_{12}v_2 + \cdots + a_{1n}v_n, \quad \ldots, \quad w_n = a_{n1}v_1 + a_{n2}v_2 + \cdots + a_{nn}v_n \]
(a) Suppose P is invertible. Show that $\{w_i\}$ and $\{v_i\}$ span the same space; hence, $\{w_i\}$ is
independent if and only if $\{v_i\}$ is independent.
(b) Suppose P is not invertible. Show that $\{w_i\}$ is dependent.
(c) Suppose $\{w_i\}$ is independent. Show that P is invertible.

(a) Because P is invertible, it is row equivalent to the identity matrix I. Hence, by Problem 4.67, $\{w_i\}$ and
$\{v_i\}$ span the same space. Thus, one is independent if and only if the other is.

(b) Because P is not invertible, it is row equivalent to a matrix with a zero row. This means that $\{w_i\}$ spans
a space that has a spanning set of fewer than n elements. Thus, $\{w_i\}$ is dependent.

(c) This is the contrapositive of the statement of (b), and so it follows from (b).


4.69. Suppose that $A_1, A_2, \ldots$ are linearly independent sets of vectors, and that $A_1 \subseteq A_2 \subseteq \cdots$. Show
that the union $A = A_1 \cup A_2 \cup \cdots$ is also linearly independent.

Suppose A is linearly dependent. Then there exist vectors $v_1, \ldots, v_n \in A$ and scalars $a_1, \ldots, a_n \in K$, not
all of them 0, such that
\[ a_1v_1 + a_2v_2 + \cdots + a_nv_n = 0 \tag{1} \]
Because $A = \bigcup A_i$ and the $v_i \in A$, there exist sets $A_{i_1}, \ldots, A_{i_n}$ such that
\[ v_1 \in A_{i_1}, \quad v_2 \in A_{i_2}, \quad \ldots, \quad v_n \in A_{i_n} \]
Let k be the maximum index of the sets $A_{i_j}$: $k = \max(i_1, \ldots, i_n)$. It follows then, as $A_1 \subseteq A_2 \subseteq \cdots$, that
each $A_{i_j}$ is contained in $A_k$. Hence, $v_1, v_2, \ldots, v_n \in A_k$, and so, by (1), $A_k$ is linearly dependent, which
contradicts our hypothesis. Thus, A is linearly independent.


4.70. Let K be a subfield of a field L, and let L be a subfield of a field E. (Thus, $K \subseteq L \subseteq E$, and K is a
subfield of E.) Suppose E is of dimension n over L, and L is of dimension m over K. Show that E is
of dimension mn over K.

Suppose $\{v_1, \ldots, v_n\}$ is a basis of E over L and $\{a_1, \ldots, a_m\}$ is a basis of L over K. We claim that
$\{a_iv_j : i = 1, \ldots, m,\ j = 1, \ldots, n\}$ is a basis of E over K. Note that $\{a_iv_j\}$ contains mn elements.

Let w be any arbitrary element in E. Because $\{v_1, \ldots, v_n\}$ spans E over L, w is a linear combination of
the $v_i$ with coefficients in L:
\[ w = b_1v_1 + b_2v_2 + \cdots + b_nv_n, \qquad b_i \in L \tag{1} \]
Because $\{a_1, \ldots, a_m\}$ spans L over K, each $b_i \in L$ is a linear combination of the $a_j$ with coefficients in K:
\[
\begin{aligned}
b_1 &= k_{11}a_1 + k_{12}a_2 + \cdots + k_{1m}a_m \\
b_2 &= k_{21}a_1 + k_{22}a_2 + \cdots + k_{2m}a_m \\
&\;\;\vdots \\
b_n &= k_{n1}a_1 + k_{n2}a_2 + \cdots + k_{nm}a_m
\end{aligned}
\]
where $k_{ij} \in K$. Substituting in (1), we obtain
\[
\begin{aligned}
w &= (k_{11}a_1 + \cdots + k_{1m}a_m)v_1 + (k_{21}a_1 + \cdots + k_{2m}a_m)v_2 + \cdots + (k_{n1}a_1 + \cdots + k_{nm}a_m)v_n \\
&= k_{11}a_1v_1 + \cdots + k_{1m}a_mv_1 + k_{21}a_1v_2 + \cdots + k_{2m}a_mv_2 + \cdots + k_{n1}a_1v_n + \cdots + k_{nm}a_mv_n \\
&= \sum_{i,j} k_{ji}(a_iv_j)
\end{aligned}
\]
where $k_{ji} \in K$. Thus, w is a linear combination of the $a_iv_j$ with coefficients in K; hence, $\{a_iv_j\}$ spans E over
K.

The proof is complete if we show that $\{a_iv_j\}$ is linearly independent over K. Suppose, for scalars
$x_{ji} \in K$, we have $\sum_{i,j} x_{ji}(a_iv_j) = 0$; that is,
\[ (x_{11}a_1v_1 + x_{12}a_2v_1 + \cdots + x_{1m}a_mv_1) + \cdots + (x_{n1}a_1v_n + x_{n2}a_2v_n + \cdots + x_{nm}a_mv_n) = 0 \]
or
\[ (x_{11}a_1 + x_{12}a_2 + \cdots + x_{1m}a_m)v_1 + \cdots + (x_{n1}a_1 + x_{n2}a_2 + \cdots + x_{nm}a_m)v_n = 0 \]
Because $\{v_1, \ldots, v_n\}$ is linearly independent over L and the above coefficients of the $v_i$ belong to L, each
coefficient must be 0:
\[ x_{11}a_1 + x_{12}a_2 + \cdots + x_{1m}a_m = 0, \quad \ldots, \quad x_{n1}a_1 + x_{n2}a_2 + \cdots + x_{nm}a_m = 0 \]




But $\{a_1, \ldots, a_m\}$ is linearly independent over K; hence, because the $x_{ji} \in K$,
\[ x_{11} = 0,\ x_{12} = 0,\ \ldots,\ x_{1m} = 0,\ \ldots,\ x_{n1} = 0,\ x_{n2} = 0,\ \ldots,\ x_{nm} = 0 \]
Accordingly, $\{a_iv_j\}$ is linearly independent over K, and the theorem is proved.


SUPPLEMENTARY PROBLEMS 


Vector Spaces 


4.71. Suppose u and v belong to a vector space V. Simplify each of the following expressions:
(a) $E_1 = 4(5u - 6v) + 2(3u + v)$, (c) $E_3 = 6(3u + 2v) + 5u - 7v$,
(b) $E_2 = 5(2u - 3v) + 4(7v + 8)$, (d) $E_4 = 3(5u + 2/v)$.
4.72. Let V be the set of ordered pairs (a, b) of real numbers with addition in V and scalar multiplication on V
defined by
(a, b) + (c, d) = (a + c, b + d) and k(a, b) = (ka, 0)
Show that V satisfies all the axioms of a vector space except [M4], that is, except 1u = u. Hence, [M4] is
not a consequence of the other axioms.
4.73. Show that Axiom [A4] of a vector space V (that u + v = v + u) can be derived from the other axioms for V.
4.74. Let V be the set of ordered pairs (a, b) of real numbers. Show that V is not a vector space over R with
addition and scalar multiplication defined by
(i) (a, b) + (c, d) = (a + d, b + c) and k(a, b) = (ka, kb),
(ii) (a, b) + (c, d) = (a + c, b + d) and k(a, b) = (a, b),
(iii) (a, b) + (c, d) = (0, 0) and k(a, b) = (ka, kb),
(iv) (a, b) + (c, d) = (ac, bd) and k(a, b) = (ka, kb).
4.75. Let V be the set of infinite sequences $(a_1, a_2, \ldots)$ in a field K. Show that V is a vector space over K with
addition and scalar multiplication defined by
$(a_1, a_2, \ldots) + (b_1, b_2, \ldots) = (a_1 + b_1, a_2 + b_2, \ldots)$ and $k(a_1, a_2, \ldots) = (ka_1, ka_2, \ldots)$
4.76. Let U and W be vector spaces over a field K. Let V be the set of ordered pairs (u, w) where $u \in U$ and
$w \in W$. Show that V is a vector space over K with addition in V and scalar multiplication on V defined by
$(u, w) + (u', w') = (u + u', w + w')$ and $k(u, w) = (ku, kw)$
(This space V is called the external direct product of U and W.)
Subspaces 
4.77. Determine whether or not W is a subspace of R^3 where W consists of all vectors (a,b,c) in R^3 such that
(a) a = 3b, (b) a ≤ b ≤ c, (c) ab = 0, (d) a + b + c = 0, (e) b = a^2, (f) a = 2b = 3c.
4.78. Let V be the vector space of n-square matrices over a field K. Show that W is a subspace of V if W consists
of all matrices A = [a_ij] that are
(a) symmetric (A^T = A or a_ij = a_ji), (b) (upper) triangular, (c) diagonal, (d) scalar.
4.79. Let AX = B be a nonhomogeneous system of linear equations in n unknowns; that is, B ≠ 0. Show that the
solution set is not a subspace of K^n.
4.80. Suppose U and W are subspaces of V for which U ∪ W is a subspace. Show that U ⊆ W or W ⊆ U.
4.81. Let V be the vector space of all functions from the real field R into R. Show that W is a subspace of V
where W consists of all: (a) bounded functions, (b) even functions. [Recall that f : R → R is bounded if
∃M ∈ R such that ∀x ∈ R, we have |f(x)| ≤ M; and f(x) is even if f(−x) = f(x), ∀x ∈ R.]
4.82. Let V be the vector space (Problem 4.75) of infinite sequences (a_1, a_2, ...) in a field K. Show that W is a
subspace of V if W consists of all sequences with (a) 0 as the first element, (b) only a finite number of
nonzero elements.

Linear Combinations, Linear Spans

4.83. Consider the vectors u = (1,2,3) and v = (2,3,1) in R^3.
(a) Write w = (1,3,8) as a linear combination of u and v.
(b) Write w = (2,4,5) as a linear combination of u and v.
(c) Find k so that w = (1,k,4) is a linear combination of u and v.
(d) Find conditions on a, b, c so that w = (a,b,c) is a linear combination of u and v.

4.84. Write the polynomial f(t) = at^2 + bt + c as a linear combination of the polynomials p_1 = (t − 1)^2,
p_2 = t − 1, p_3 = 1. [Thus, p_1, p_2, p_3 span the space P_2(t) of polynomials of degree ≤ 2.]

4.85. Find one vector in R^3 that spans the intersection of U and W where U is the xy-plane, that is,
U = {(a,b,0)}, and W is the space spanned by the vectors (1,1,1) and (1,2,3).

4.86. Prove that span(S) is the intersection of all subspaces of V containing S.

4.87. Show that span(S) = span(S ∪ {0}). That is, by joining or deleting the zero vector from a set, we do not
change the space spanned by the set.

4.88. Show that (a) If S ⊆ T, then span(S) ⊆ span(T). (b) span[span(S)] = span(S).

Linear Dependence and Linear Independence

4.89. Determine whether the following vectors in R^4 are linearly dependent or independent:
(a) (1,2,−3,1), (3,7,1,−2), (1,3,7,−4); (b) (1,3,1,−2), (2,5,−1,3), (1,3,7,−2).

4.90. Determine whether the following polynomials u, v, w in P(t) are linearly dependent or independent:
(a) u = t^3 − 4t^2 + 3t + 3, v = t^3 + 2t^2 + 4t − 1, w = 2t^3 − t^2 − 3t + 5;
(b) u = t^3 − 5t^2 − 2t + 3, v = t^3 − 4t^2 − 3t + 4, w = 2t^3 − 7t^2 − 7t + 9.

4.91. Show that the following functions f, g, h are linearly independent:
(a) f(t) = e^t, g(t) = sin t, h(t) = t^2; (b) f(t) = e^t, g(t) = e^{2t}, h(t) = t.

4.92. Show that u = (a,b) and v = (c,d) in K^2 are linearly dependent if and only if ad − bc = 0.

4.93. Suppose u, v, w are linearly independent vectors. Prove that S is linearly independent where
(a) S = {u + v − 2w, u − v − w, u + w}; (b) S = {u + v − 3w, u + 3v − w, v + w}.

4.94. Suppose {u_1, ..., u_r, w_1, ..., w_s} is a linearly independent subset of V. Show that
span(u_i) ∩ span(w_j) = {0}

4.95. Suppose v_1, v_2, ..., v_n are linearly independent. Prove that S is linearly independent where
(a) S = {a_1 v_1, a_2 v_2, ..., a_n v_n} and each a_i ≠ 0.
(b) S = {v_1, ..., v_{k−1}, w, v_{k+1}, ..., v_n} and w = Σ_i b_i v_i and b_k ≠ 0.

4.96. Suppose (a_11, ..., a_1n), (a_21, ..., a_2n), ..., (a_m1, ..., a_mn) are linearly independent vectors in K^n, and
suppose v_1, v_2, ..., v_n are linearly independent vectors in a vector space V over K. Show that the following
vectors are also linearly independent:

w_1 = a_11 v_1 + ··· + a_1n v_n, w_2 = a_21 v_1 + ··· + a_2n v_n, ..., w_m = a_m1 v_1 + ··· + a_mn v_n


Basis and Dimension 
4.97. Find a subset of u_1, u_2, u_3, u_4 that gives a basis for W = span(u_i) of R^5, where
(a) u_1 = (1,1,1,2,3), u_2 = (1,2,−1,−2,1), u_3 = (3,5,−1,−2,5), u_4 = (1,2,1,−1,4)
(b) u_1 = (1,−2,1,3,−1), u_2 = (−2,4,−2,−6,2), u_3 = (1,−3,1,2,1), u_4 = (3,−7,3,8,−1)
(c) u_1 = (1,0,1,0,1), u_2 = (1,1,2,1,0), u_3 = (2,1,3,1,1), u_4 = (1,2,1,1,1)
(d) u_1 = (1,0,1,1,1), u_2 = (2,1,2,0,1), u_3 = (1,1,2,3,4), u_4 = (4,2,5,4,6)


4.98. Consider the subspaces U = {(a,b,c,d) : b − 2c + d = 0} and W = {(a,b,c,d) : a = d, b = 2c} of R^4.
Find a basis and the dimension of (a) U, (b) W, (c) U ∩ W.


4.99. Find a basis and the dimension of the solution space W of each of the following homogeneous systems: 


(a) x + 2y − 2z + 2s − t = 0
    x + 2y − z + 3s − 2t = 0
    2x + 4y − 7z + s + t = 0

(b) x + 2y − z + 3s − 4t = 0
    2x + 4y − 2z − s + 5t = 0
    2x + 4y − 2z + 4s − 2t = 0


4.100. Find a homogeneous system whose solution space is spanned by the following sets of three vectors: 


(a) (1,−2,0,3,−1), (2,−3,2,5,−3), (1,−2,1,2,−2);
(b) (1,1,2,1,1), (1,2,1,4,3), (3,5,4,9,7).


4.101. Determine whether each of the following is a basis of the vector space P_n(t):

(a) {1, 1 + t, 1 + t + t^2, 1 + t + t^2 + t^3, ..., 1 + t + t^2 + ··· + t^{n−1} + t^n};
(b) {1 + t, t + t^2, t^2 + t^3, ..., t^{n−2} + t^{n−1}, t^{n−1} + t^n}.


4.102. Find a basis and the dimension of the subspace W of P(t) spanned by

(a) u = t^3 + 2t^2 − 2t + 1, v = t^3 + 3t^2 − t + 4, w = 2t^3 + t^2 − 7t − 7,
(b) u = t^3 + t^2 − 3t + 2, v = 2t^3 + t^2 + t − 4, w = 4t^3 + 3t^2 − 5t + 2.


4.103. Find a basis and the dimension of the subspace W of V = M_{2,2} spanned by
A = [1,−5; −4,2], B = [1,1; −1,5], C = [2,−4; −5,7], D = [1,−7; −5,1]


Rank of a Matrix, Row and Column Spaces 


4.104. Find the rank of each of the following matrices:

(a) [1,3,−2,5,4; 1,4,1,3,5; 1,4,2,4,3; 2,7,−3,6,13],
(b) [1,2,−3,−2; 1,3,−2,0; 3,8,−7,−2; 2,1,−9,−10],
(c) [1,1,2; 4,5,5; 5,8,1; −1,−2,2]
4.105. For k = 1,2,...,5, find the number n_k of linearly independent subsets consisting of k columns for each of
the following matrices:
(a) A = [1,1,0,2,3; 1,2,0,2,5; 1,3,0,2,7], (b) B = [1,2,1,0,2; 1,2,3,0,4; 1,1,5,0,6]
4.106. Let (a) A = [1,2,1,3,1,6; 2,4,3,8,3,15; 1,2,2,5,3,11; 4,8,6,16,7,32],
(b) B = [1,2,2,1,2,1; 2,4,5,4,5,5; 1,2,3,4,4,6; 3,6,7,7,9,10].
For each matrix (where C_1, ..., C_6 denote its columns):

(i) Find its row canonical form M.
(ii) Find the columns that are linear combinations of preceding columns.
(iii) Find columns (excluding C_6) that form a basis for the column space.
(iv) Express C_6 as a linear combination of the basis vectors obtained in (iii).


4.107. Determine which of the following matrices have the same row space:

A = [1,−2,−1; 3,−4,5], B = [1,−1,2; 2,3,−1], C = [1,−1,3; 2,−1,10; 3,−5,1]


4.108. Determine which of the following subspaces of R^3 are identical:
U_1 = span[(1,1,−1), (2,3,−1), (3,1,−5)], U_2 = span[(1,−1,−3), (3,−2,−8), (2,1,−3)],
U_3 = span[(1,1,1), (1,−1,3), (3,−1,7)]


4.109. Determine which of the following subspaces of R^4 are identical:
U_1 = span[(1,2,1,4), (2,4,1,5), (3,6,2,9)], U_2 = span[(1,2,1,2), (2,4,1,3)],
U_3 = span[(1,2,3,10), (2,4,3,11)]


4.110. Find a basis for (i) the row space and (ii) the column space of each matrix M:

(a) M = [0,0,3,1,4; 1,3,1,2,1; 3,9,4,5,2; 4,12,8,8,7], (b) M = [1,2,1,0,1; 1,2,2,1,3; 3,6,5,2,7; 2,4,1,−1,0]


4.111. Show that if any row is deleted from a matrix in echelon (respectively, row canonical) form, then the 
resulting matrix is still in echelon (respectively, row canonical) form. 


4.112. Let A and B be arbitrary m × n matrices. Show that rank(A + B) ≤ rank(A) + rank(B).
4.113. Let r = rank(A + B). Find 2 × 2 matrices A and B such that

(a) r < rank(A), rank(B); (b) r = rank(A) = rank(B); (c) r > rank(A), rank(B).
Sums, Direct Sums, Intersections
4.114. Suppose U and W are two-dimensional subspaces of K^3. Show that U ∩ W ≠ {0}.


4.115. Suppose U and W are subspaces of V such that dim U = 4, dim W = 5, and dim V = 7. Find the possible
dimensions of U ∩ W.


4.116. Let U and W be subspaces of R^3 for which dim U = 1, dim W = 2, and U ⊄ W. Show that R^3 = U ⊕ W.


4.117. Consider the following subspaces of R^5:
U = span[(1,−1,−1,−2,0), (1,−2,−2,0,−3), (1,−1,−2,−2,1)]
W = span[(1,−2,−3,0,−2), (1,−1,−3,2,−4), (1,−1,−2,2,−5)]
(a) Find two homogeneous systems whose solution spaces are U and W, respectively.
(b) Find a basis and the dimension of U ∩ W.


4.118. Let U_1, U_2, U_3 be the following subspaces of R^3:
U_1 = {(a,b,c) : a = c}, U_2 = {(a,b,c) : a + b + c = 0}, U_3 = {(0,0,c)}
Show that (a) R^3 = U_1 + U_2, (b) R^3 = U_2 + U_3, (c) R^3 = U_1 + U_3. When is the sum direct?
4.119. Suppose U, W_1, W_2 are subspaces of a vector space V. Show that
(U ∩ W_1) + (U ∩ W_2) ⊆ U ∩ (W_1 + W_2)
Find subspaces of R^2 for which equality does not hold.
4.120. Suppose W_1, W_2, ..., W_r are subspaces of a vector space V. Show that
(a) span(W_1, W_2, ..., W_r) = W_1 + W_2 + ··· + W_r.
(b) If S_i spans W_i for i = 1, ..., r, then S_1 ∪ S_2 ∪ ··· ∪ S_r spans W_1 + W_2 + ··· + W_r.
4.121. Suppose V = U ⊕ W. Show that dim V = dim U + dim W.
4.122. Let S and T be arbitrary nonempty subsets (not necessarily subspaces) of a vector space V and let k be a
scalar. The sum S + T and the scalar product kS are defined by
S + T = {u + v : u ∈ S, v ∈ T}, kS = {ku : u ∈ S}
[We also write w + S for {w} + S.] Let
S = {(1,2), (2,3)}, T = {(1,4), (1,5), (2,5)}, w = (1,1), k = 3
Find: (a) S + T, (b) w + S, (c) kS, (d) kT, (e) kS + kT, (f) k(S + T).
4.123. Show that the above operations of S + T and kS satisfy
(a) Commutative law: S + T = T + S.
(b) Associative law: (S_1 + S_2) + S_3 = S_1 + (S_2 + S_3).
(c) Distributive law: k(S + T) = kS + kT.
(d) S + {0} = {0} + S = S and S + V = V + S = V.
4.124. Let V be the vector space of n-square matrices. Let U be the subspace of upper triangular matrices, and let
W be the subspace of lower triangular matrices. Find (a) U ∩ W, (b) U + W.
4.125. Let V be the external direct sum of vector spaces U and W over a field K. (See Problem 4.76.) Let
Û = {(u,0) : u ∈ U} and Ŵ = {(0,w) : w ∈ W}
Show that (a) Û and Ŵ are subspaces of V, (b) V = Û ⊕ Ŵ.
4.126. Suppose V = U ⊕ W. Let V̂ be the external direct sum of U and W. Show that V is isomorphic to V̂ under
the correspondence v = u + w ↔ (u,w).
4.127. Use induction to prove (a) Theorem 4.22, (b) Theorem 4.23. 
Coordinates 
4.128. The vectors u_1 = (1,−2) and u_2 = (4,−7) form a basis S of R^2. Find the coordinate vector [v] of v relative
to S where (a) v = (5,3), (b) v = (a,b).
4.129. The vectors u_1 = (1,2,0), u_2 = (1,3,2), u_3 = (0,1,3) form a basis S of R^3. Find the coordinate vector [v]
of v relative to S where (a) v = (2,7,−4), (b) v = (a,b,c).
4.130. S = {t^3 + t^2, t^2 + t, t + 1, 1} is a basis of P_3(t). Find the coordinate vector [v] of v relative to S
where (a) v = 2t^3 + t^2 − 4t + 2, (b) v = at^3 + bt^2 + ct + d.
4.131. Let V = M_{2,2}. Find the coordinate vector [A] of A relative to S where
S = {[1,1; 1,1], [1,−1; 1,0], [1,1; 0,0], [1,0; 0,0]} and (a) A = [3,−5; 6,7], (b) A = [a,b; c,d].
4.132. Find the dimension and a basis of the subspace W of P_3(t) spanned by
u = t^3 + 2t^2 − 3t + 4, v = 2t^3 + 5t^2 − 4t + 7, w = t^3 + 4t^2 + t + 2
4.133. Find the dimension and a basis of the subspace W of M = M_{2,3} spanned by
A = [1,2,1; 3,1,2], B = [2,4,3; 7,5,6], C = [1,2,3; 5,7,6]


Miscellaneous Problems 


4.134. Answer true or false. If false, prove it with a counterexample.
(a) If u_1, u_2, u_3 span V, then dim V = 3.
(b) If A is a 4 × 8 matrix, then any six columns are linearly dependent.
(c) If u_1, u_2, u_3 are linearly independent, then u_1, u_2, u_3, w are linearly dependent.
(d) If u_1, u_2, u_3, u_4 are linearly independent, then dim V ≥ 4.
(e) If u_1, u_2, u_3 span V, then w, u_1, u_2, u_3 span V.
(f) If u_1, u_2, u_3, u_4 are linearly independent, then u_1, u_2, u_3 are linearly independent.

4.135. Answer true or false. If false, prove it with a counterexample.
(a) If any column is deleted from a matrix in echelon form, then the resulting matrix is still in echelon
form.
(b) If any column is deleted from a matrix in row canonical form, then the resulting matrix is still in row
canonical form.
(c) If any column without a pivot is deleted from a matrix in row canonical form, then the resulting matrix
is in row canonical form.

4.136. Determine the dimension of the vector space W of the following n-square matrices:
(a) symmetric matrices, (b) antisymmetric matrices,
(c) diagonal matrices, (d) scalar matrices.

4.137. Let t_1, t_2, ..., t_n be symbols, and let K be any field. Let V be the following set of expressions where a_i ∈ K:
a_1 t_1 + a_2 t_2 + ··· + a_n t_n
Define addition in V and scalar multiplication on V by
(a_1 t_1 + ··· + a_n t_n) + (b_1 t_1 + ··· + b_n t_n) = (a_1 + b_1)t_1 + ··· + (a_n + b_n)t_n
k(a_1 t_1 + a_2 t_2 + ··· + a_n t_n) = ka_1 t_1 + ka_2 t_2 + ··· + ka_n t_n
Show that V is a vector space over K with the above operations. Also, show that {t_1, ..., t_n} is a basis of V,
where

[Some answers, such as bases, need not be unique.]


4.71. (a) E_1 = 26u − 22v; (b) The sum 7v + 8 is not defined, so E_2 is not defined;
(c) E_3 = 23u + 5v; (d) Division by v is not defined, so E_4 is not defined.


4.77. (a) Yes; (b) No; e.g., (1,2,3) ∈ W but −2(1,2,3) ∉ W;
(c) No; e.g., (1,0,0), (0,1,0) ∈ W, but not their sum; (d) Yes;
(e) No; e.g., (1,1,1) ∈ W, but 2(1,1,1) ∉ W; (f) Yes


4.79. The zero vector 0 is not a solution. 


4.83. (a) w = 3u − v, (b) Impossible, (c) k = 11/5, (d) 7a − 5b + c = 0
4.84. Using f = xp_1 + yp_2 + zp_3, we get x = a, y = 2a + b, z = a + b + c

4.85. v= (2,1,0) 

4.89. (a) Dependent, (b) Independent 

4.90. (a) Independent, (b) Dependent 

4.97. (a) u_1, u_2, u_4; (b) u_1, u_2, u_3; (c) u_1, u_2, u_4; (d) u_1, u_2, u_3

4.98. (a) dim U = 3, (b) dim W = 2, (c) dim(U ∩ W) = 1


4.99. (a) Basis: {(2,— 
(b) Basis: {(2, 


4.100. (a) 5x + y − z − s = 0, x + y − z − t = 0;
(b) 3x − y − z = 0, 2x − 3y + s = 0, x − 2y + t = 0


4.101. (a) Yes, (b) No, because dim P_n(t) = n + 1, but the set contains only n elements.
4.102. (a) dim W =2, (b) dimW=3 

4.103. dim W =2 

4.104. (a) 3, (b) 2, (c) 3 

4.105. (a) n_1 = 4, n_2 = 5, n_3 = n_4 = n_5 = 0; (b) n_1 = 4, n_2 = 6, n_3 = 3, n_4 = n_5 = 0


4.106. (a) (i) M = [1,2,0,1,0,3; 0,0,1,2,0,1; 0,0,0,0,1,2; 0];
(ii) C_2, C_4, C_6; (iii) C_1, C_3, C_5; (iv) C_6 = 3C_1 + C_3 + 2C_5.
(b) (i) M = [1,2,0,0,3,1; 0,0,1,0,−1,−1; 0,0,0,1,1,2; 0];
(ii) C_2, C_5, C_6; (iii) C_1, C_3, C_4; (iv) C_6 = C_1 − C_3 + 2C_4.


4.107. A and C are row equivalent to [1,0,7; 0,1,4], but not B

4.108. U_1 and U_2 are row equivalent to [1,0,−2; 0,1,1], but not U_3

4.109. U_1 and U_3 are row equivalent to [1,2,0,1; 0,0,1,3], but not U_2

4.110. (a) (i) (1,3,1,2,1), (0,0,1,−1,−1), (0,0,0,4,7); (ii) C_1, C_3, C_4
(b) (i) (1,2,1,0,1), (0,0,1,1,2); (ii) C_1, C_3
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4.113. (a) ial ET E (b) 4=|) a= I 


4.115. dim(U ∩ W) = 2, 3, or 4


4.117. (a) (i) 3x + 4y − z − t = 0, 4x + 2y + s = 0; (ii) 4x + 2y − s = 0, 9x + 2y + z + t = 0;
(b) Basis: {(1,−2,−5,0,0), (0,0,1,0,−1)}; dim(U ∩ W) = 2


4.118. The sum is direct in (b) and (c). 


4.119. In R^2, let U, V, W be, respectively, the line y = x, the x-axis, the y-axis.


4.122. (a) {(2,6), (2,7), (3,7), (3,8), (4,8)}; (b) {(2,3), (3,4)};
(c) {(3,6), (6,9)}; (d) {(3,12), (3,15), (6,15)};
(e and f) {(6,18), (6,21), (9,21), (9,24), (12,24)}


4.124. (a) Diagonal matrices, (b) V
4.128. (a) [−41, 11], (b) [−7a − 4b, 2a + b]


4.129. (a) [−11, 13, −10], (b) [c − 3b + 7a, −c + 3b − 6a, c − 2b + 4a]


4.130. (a) [2, −1, −2, 2], (b) [a, b − c, c − b + a, d − c + b − a]


4.131. (a) [7, −1, −13, 10], (b) [d, c − d, b + c − 2d, a − b − 2c + 2d]


4.132. dim W = 2; basis: {t^3 + 2t^2 − 3t + 4, t^2 + 2t − 1}
4.133. dim W = 2; basis: {[1,2,1,3,1,2], [0,0,1,1,3,2]}


4.134. (a) False; (1,1), (1,2), (2,1) span R^2; (b) True;
(c) False; (1,0,0,0), (0,1,0,0), (0,0,1,0), w = (0,0,0,1);
(d) True; (e) True; (f) True


4.135. (a) True; (b) False; e.g., delete C_2 from [1,0,3; 0,1,2]; (c) True
4.136. (a) (1/2)n(n + 1), (b) (1/2)n(n − 1), (c) n, (d) 1


Linear Mappings 


5.1 Introduction 


The main subject matter of linear algebra is the study of linear mappings and their representation by 
means of matrices. This chapter introduces us to these linear maps and Chapter 6 shows how they can be 
represented by matrices. First, however, we begin with a study of mappings in general. 


5.2 Mappings, Functions 


Let A and B be arbitrary nonempty sets. Suppose to each element a ∈ A there is assigned a unique
element of B, called the image of a. The collection f of such assignments is called a mapping (or map)
from A into B, and it is denoted by


f : A → B


The set A is called the domain of the mapping, and B is called the target set. We write f(a), read "f of a,"
for the unique element of B that f assigns to a ∈ A.

One may also view a mapping f : A → B as a computer that, for each input value a ∈ A, produces a
unique output f(a) ∈ B.


Remark: The term function is used synonymously with the word mapping, although some texts
reserve the word "function" for a real-valued or complex-valued mapping.


Consider a mapping f : A → B. If A' is any subset of A, then f(A') denotes the set of images of
elements of A'; and if B' is any subset of B, then f^{-1}(B') denotes the set of elements of A, each of whose
images lies in B'. That is,


f(A') = {f(a) : a ∈ A'} and f^{-1}(B') = {a ∈ A : f(a) ∈ B'}


We call f(A') the image of A' and f^{-1}(B') the inverse image or preimage of B'. In particular, the set of all
images (i.e., f(A)) is called the image or range of f.

To each mapping f : A → B there corresponds the subset of A × B given by {(a, f(a)) : a ∈ A}. We
call this set the graph of f. Two mappings f : A → B and g : A → B are defined to be equal, written
f = g, if f(a) = g(a) for every a ∈ A; that is, if they have the same graph. Thus, we do not distinguish
between a function and its graph. The negation of f = g is written f ≠ g and is the statement:


There exists an a ∈ A for which f(a) ≠ g(a).


Sometimes the "barred" arrow ↦ is used to denote the image of an arbitrary element x ∈ A under a
mapping f : A → B by writing


x ↦ f(x)


This is illustrated in the following example.


EXAMPLE 5.1

(a) Let f : R → R be the function that assigns to each real number x its square x^2. We can denote this function by
writing
f(x) = x^2 or x ↦ x^2


Here the image of −3 is 9, so we may write f(−3) = 9. However, f^{-1}(9) = {3, −3}. Also,
f(R) = [0, ∞) = {x : x ≥ 0} is the image of f.


(b) Let A = {a,b,c,d} and B = {x,y,z,t}. Then the following defines a mapping f : A → B:
f(a) = y, f(b) = x, f(c) = z, f(d) = y or f = {(a,y), (b,x), (c,z), (d,y)}
The first defines the mapping explicitly, and the second defines the mapping by its graph. Here,
f({a,b,d}) = {f(a), f(b), f(d)} = {y, x, y} = {x, y}
Furthermore, f(A) = {x,y,z} is the image of f.
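
The finite mapping of Example 5.1(b) can be experimented with directly. The following is only an illustrative sketch: it stores the graph of f as a Python dictionary, and the helper names `image` and `preimage` are hypothetical names chosen here, not notation from the text.

```python
# Graph of the mapping f : A -> B from Example 5.1(b), stored as a dictionary.
f = {"a": "y", "b": "x", "c": "z", "d": "y"}

def image(mapping, subset):
    """Return f(A') = {f(a) : a in A'} for a subset A' of the domain."""
    return {mapping[a] for a in subset}

def preimage(mapping, targets):
    """Return f^{-1}(B') = {a in A : f(a) in B'} for a subset B' of the target set."""
    return {a for a, b in mapping.items() if b in targets}

print(image(f, {"a", "b", "d"}))   # the set {x, y}
print(image(f, f.keys()))          # the image of f, the set {x, y, z}
print(preimage(f, {"y"}))          # the set {a, d}
print(preimage(f, {"t"}))          # empty set: t is in B but is not an image
```

Note that the preimage of {t} is empty, which reflects that the image f(A) = {x,y,z} is a proper subset of the target set B.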
EXAMPLE 5.2 Let V be the vector space of polynomials over R, and let p(t) = 3t^2 − 5t + 2.
(a) The derivative defines a mapping D : V → V where, for any polynomial f(t), we have D(f) = df/dt. Thus,
D(p) = D(3t^2 − 5t + 2) = 6t − 5


(b) The integral, say from 0 to 1, defines a mapping J : V → R. That is, for any polynomial f(t),

J(f) = ∫_0^1 f(t) dt and so J(p) = ∫_0^1 (3t^2 − 5t + 2) dt = 1/2


Observe that the mapping in (b) is from the vector space V into the scalar field R, whereas the mapping in (a) is from
the vector space V into itself.


Matrix Mappings 


Let A be any m × n matrix over K. Then A determines a mapping F_A : K^n → K^m by


F_A(u) = Au
where the vectors in K^n and K^m are written as columns. For example, suppose

A = [1,−4,5; 2,3,−6] and u = [1, 3, −5]^T

then

F_A(u) = Au = [1,−4,5; 2,3,−6][1, 3, −5]^T = [−36, 41]^T


Remark: For notational convenience, we will frequently denote the mapping F_A by the letter A, the
same symbol as used for the matrix.


Composition of Mappings 

Consider two mappings f : A → B and g : B → C, illustrated below:

A --f--> B --g--> C

The composition of f and g, denoted by g∘f, is the mapping g∘f : A → C defined by
(g∘f)(a) = g(f(a))
That is, first we apply f to a ∈ A, and then we apply g to f(a) ∈ B to get g(f(a)) ∈ C. Viewing f and g
as "computers," the composition means we first input a ∈ A to get the output f(a) ∈ B using f, and then
we input f(a) to get the output g(f(a)) ∈ C using g.
Our first theorem tells us that the composition of mappings satisfies the associative law.

THEOREM 5.1: Let f : A → B, g : B → C, h : C → D. Then

h∘(g∘f) = (h∘g)∘f

We prove this theorem here. Let a ∈ A. Then

(h∘(g∘f))(a) = h((g∘f)(a)) = h(g(f(a)))

((h∘g)∘f)(a) = (h∘g)(f(a)) = h(g(f(a)))
Thus, (h∘(g∘f))(a) = ((h∘g)∘f)(a) for every a ∈ A, and so h∘(g∘f) = (h∘g)∘f.
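
The associative law of Theorem 5.1 can be spot-checked on sample inputs. This is a small sketch, not a proof: the helper `compose` and the three sample functions f, g, h are assumptions made for illustration only.

```python
def compose(g, f):
    """Return the composition g∘f, i.e. the function a -> g(f(a))."""
    return lambda a: g(f(a))

# Three sample mappings on the integers (chosen arbitrarily for this sketch).
f = lambda x: x + 1
g = lambda x: 2 * x
h = lambda x: x * x

left = compose(h, compose(g, f))    # h∘(g∘f)
right = compose(compose(h, g), f)   # (h∘g)∘f

# Both sides evaluate to h(g(f(a))) on every sample input.
print(all(left(a) == right(a) for a in range(-5, 6)))
print(left(3))  # h(g(f(3))) = (2 * (3 + 1))**2
```

Agreement on a finite sample only illustrates the theorem; the argument above is what establishes it for every a ∈ A.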


One-to-One and Onto Mappings 
We formally introduce some special types of mappings. 
DEFINITION: A mapping f : A → B is said to be one-to-one (or 1-1 or injective) if different elements
of A have distinct images; that is,
If f(a) = f(a'), then a = a'.


DEFINITION: A mapping f : A → B is said to be onto (or f maps A onto B or surjective) if every b ∈ B
is the image of at least one a ∈ A.


DEFINITION: A mapping f : A → B is said to be a one-to-one correspondence between A and B (or
bijective) if f is both one-to-one and onto.
EXAMPLE 5.3 Let f : R → R, g : R → R, h : R → R be defined by
f(x) = 2^x, g(x) = x^3 − x, h(x) = x^2


The graphs of these functions are shown in Fig. 5-1. The function f is one-to-one. Geometrically, this means
that each horizontal line does not contain more than one point of f. The function g is onto. Geometrically,
this means that each horizontal line contains at least one point of g. The function h is neither one-to-one nor
onto. For example, both 2 and −2 have the same image 4, and −16 has no preimage.


[Figure 5-1: graphs of f(x) = 2^x, g(x) = x^3 − x, and h(x) = x^2]
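
For mappings between finite sets, the definitions of one-to-one and onto can be tested by exhaustive enumeration. The sketch below is a finite analogue only (the functions of Example 5.3 are defined on all of R and cannot be checked this way); the names `is_one_to_one` and `is_onto` are hypothetical helpers introduced here.

```python
def is_one_to_one(f, domain):
    """True if distinct elements of the (finite) domain have distinct images."""
    images = [f(a) for a in domain]
    return len(set(images)) == len(images)

def is_onto(f, domain, target):
    """True if every element of the (finite) target is the image of some a."""
    return set(target) <= {f(a) for a in domain}

domain = [-2, -1, 0, 1, 2]
square = lambda x: x * x

print(is_one_to_one(square, domain))           # False: 2 and -2 share the image 4
print(is_onto(square, domain, [0, 1, 4]))      # True
print(is_onto(square, domain, [0, 1, 2, 4]))   # False: 2 has no preimage

# A one-to-one correspondence of {0,...,4} with itself: a cyclic shift.
shift = lambda a: (a + 1) % 5
print(is_one_to_one(shift, range(5)), is_onto(shift, range(5), range(5)))
```

Whether a mapping is onto depends on the chosen target set, which is why `is_onto` takes the target as an explicit argument.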


Identity and Inverse Mappings 


Let A be any nonempty set. The mapping f : A → A defined by f(a) = a (that is, the function that
assigns to each element in A itself) is called the identity mapping. It is usually denoted by 1_A or 1 or I. Thus,
for any a ∈ A, we have 1_A(a) = a.
Now let f : A → B. We call g : B → A the inverse of f, written f^{-1}, if
f∘g = 1_B and g∘f = 1_A
We emphasize that f has an inverse if and only if f is a one-to-one correspondence between A and B; that
is, f is one-to-one and onto (Problem 5.7). Also, if b ∈ B, then f^{-1}(b) = a, where a is the unique element
of A for which f(a) = b.


5.3 Linear Mappings (Linear Transformations) 


We begin with a definition. 

DEFINITION: Let V and U be vector spaces over the same field K. A mapping F : V → U is called a
linear mapping or linear transformation if it satisfies the following two conditions:
(1) For any vectors v, w ∈ V, F(v + w) = F(v) + F(w).
(2) For any scalar k and vector v ∈ V, F(kv) = kF(v).


Namely, F : V → U is linear if it "preserves" the two basic operations of a vector space, that of
vector addition and that of scalar multiplication.

Substituting k = 0 into condition (2), we obtain F(0) = 0. Thus, every linear mapping takes the zero
vector into the zero vector.

Now for any scalars a, b ∈ K and any vectors v, w ∈ V, we obtain


F(av + bw) = F(av) + F(bw) = aF(v) + bF(w)


More generally, for any scalars a_i ∈ K and any vectors v_i ∈ V, we obtain the following basic property of
linear mappings:


F(a_1 v_1 + a_2 v_2 + ··· + a_m v_m) = a_1 F(v_1) + a_2 F(v_2) + ··· + a_m F(v_m)


Remark 1: A linear mapping F : V → U is completely characterized by the condition
F(av + bw) = aF(v) + bF(w) (*)
and so this condition is sometimes used as its definition.


Remark 2: The term linear transformation rather than linear mapping is frequently used for linear
mappings of the form F : R^n → R^m.


EXAMPLE 5.4 


(a) Let F : R^3 → R^3 be the "projection" mapping into the xy-plane; that is, F is the mapping defined by
F(x,y,z) = (x,y,0). We show that F is linear. Let v = (a,b,c) and w = (a',b',c'). Then


F(v + w) = F(a + a', b + b', c + c') = (a + a', b + b', 0)
= (a,b,0) + (a',b',0) = F(v) + F(w)
and, for any scalar k,
F(kv) = F(ka, kb, kc) = (ka, kb, 0) = k(a,b,0) = kF(v)


Thus, F is linear.


(b) Let G : R^2 → R^2 be the "translation" mapping defined by G(x,y) = (x + 1, y + 2). [That is, G adds the vector
(1,2) to any vector v = (x,y) in R^2.] Note that


G(0) = G(0,0) = (1,2) ≠ 0


Thus, the zero vector is not mapped into the zero vector. Hence, G is not linear.
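
The two defining conditions of a linear mapping can be spot-checked numerically for the maps F and G of Example 5.4. The sketch below is an assumption-laden illustration: `looks_linear` is a hypothetical helper, and passing it on a few samples does not prove linearity, although a single failure does disprove it.

```python
def F(v):
    """Projection of R^3 into the xy-plane: (x, y, z) -> (x, y, 0)."""
    x, y, z = v
    return (x, y, 0)

def G(v):
    """Translation of R^2 by the vector (1, 2): (x, y) -> (x + 1, y + 2)."""
    x, y = v
    return (x + 1, y + 2)

def add(v, w):
    return tuple(a + b for a, b in zip(v, w))

def scale(k, v):
    return tuple(k * a for a in v)

def looks_linear(T, samples, scalars):
    """Check T(v + w) = T(v) + T(w) and T(kv) = kT(v) on the given samples only."""
    for v in samples:
        for w in samples:
            if T(add(v, w)) != add(T(v), T(w)):
                return False
        for k in scalars:
            if T(scale(k, v)) != scale(k, T(v)):
                return False
    return True

print(looks_linear(F, [(1, 2, 3), (0, -1, 4), (2, 2, 2)], [0, -1, 3]))  # True
print(looks_linear(G, [(0, 0), (1, 1)], [0, 2]))                         # False
print(G((0, 0)))  # (1, 2): the zero vector is not mapped into the zero vector
```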
EXAMPLE 5.5 (Derivative and Integral Mappings) Consider the vector space V = P(t) of polynomials over the
real field R. Let u(t) and v(t) be any polynomials in V and let k be any scalar.
(a) Let D : V → V be the derivative mapping. One proves in calculus that

d(u + v)/dt = du/dt + dv/dt and d(ku)/dt = k du/dt


That is, D(u + v) = D(u) + D(v) and D(ku) = kD(u). Thus, the derivative mapping is linear.


(b) Let J : V → R be an integral mapping, say

J(f(t)) = ∫_0^1 f(t) dt

One also proves in calculus that

∫_0^1 [u(t) + v(t)] dt = ∫_0^1 u(t) dt + ∫_0^1 v(t) dt

and

∫_0^1 ku(t) dt = k ∫_0^1 u(t) dt

That is, J(u + v) = J(u) + J(v) and J(ku) = kJ(u). Thus, the integral mapping is linear.
EXAMPLE 5.6 (Zero and Identity Mappings) 


(a) Let F : V → U be the mapping that assigns the zero vector 0 ∈ U to every vector v ∈ V. Then, for any vectors
v, w ∈ V and any scalar k ∈ K, we have


F(v + w) = 0 = 0 + 0 = F(v) + F(w) and F(kv) = 0 = k0 = kF(v)
Thus, F is linear. We call F the zero mapping, and we usually denote it by 0.


(b) Consider the identity mapping I : V → V, which maps each v ∈ V into itself. Then, for any vectors v, w ∈ V
and any scalars a, b ∈ K, we have


I(av + bw) = av + bw = aI(v) + bI(w)
Thus, I is linear.


Our next theorem (proved in Problem 5.13) gives us an abundance of examples of linear mappings. In 
particular, it tells us that a linear mapping is completely determined by its values on the elements of a basis. 


THEOREM 5.2: Let V and U be vector spaces over a field K. Let {v_1, v_2, ..., v_n} be a basis of V and
let u_1, u_2, ..., u_n be any vectors in U. Then there exists a unique linear mapping
F : V → U such that F(v_1) = u_1, F(v_2) = u_2, ..., F(v_n) = u_n.


We emphasize that the vectors u_1, u_2, ..., u_n in Theorem 5.2 are completely arbitrary; they may be
linearly dependent or they may even be equal to each other.
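
Theorem 5.2 can be made concrete with a small sketch. It assumes the basis (1,−2), (4,−7) of R^2 from Problem 4.128, for which the coordinates of (a,b) are x = −7a − 4b and y = 2a + b (one checks that x(1,−2) + y(4,−7) = (a,b)). The image vectors and the helper names `coords` and `make_linear_map` are arbitrary choices made here for illustration.

```python
def coords(v):
    """Coordinates (x, y) of v = (a, b) relative to the basis (1,-2), (4,-7) of R^2."""
    a, b = v
    return (-7 * a - 4 * b, 2 * a + b)

def make_linear_map(w1, w2):
    """Return the linear map F with F(1,-2) = w1 and F(4,-7) = w2.

    F is forced by linearity: if v = x*(1,-2) + y*(4,-7), then F(v) = x*w1 + y*w2.
    """
    def F(v):
        x, y = coords(v)
        return tuple(x * p + y * q for p, q in zip(w1, w2))
    return F

# The images may be completely arbitrary; here they are even equal to each other.
F = make_linear_map((1, 0, 2), (1, 0, 2))
print(F((1, -2)), F((4, -7)))   # both basis vectors map to (1, 0, 2)
print(F((5, -9)))               # (5,-9) = (1,-2) + (4,-7), so the image is (2, 0, 4)
```

Uniqueness is visible in the construction: once the values on the basis are fixed, the formula F(v) = x w_1 + y w_2 leaves no further choice.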


Matrices as Linear Mappings 


Let A be any real m × n matrix. Recall that A determines a mapping F_A : K^n → K^m by F_A(u) = Au
(where the vectors in K^n and K^m are written as columns). We show F_A is linear. By matrix multiplication,


F_A(v + w) = A(v + w) = Av + Aw = F_A(v) + F_A(w)
F_A(kv) = A(kv) = k(Av) = kF_A(v)


In other words, using A to represent the mapping, we have
A(v + w) = Av + Aw and A(kv) = k(Av)


Thus, the matrix mapping A is linear.
Vector Space Isomorphism


The notion of two vector spaces being isomorphic was defined in Chapter 4 when we investigated the
coordinates of a vector relative to a basis. We now redefine this concept.


DEFINITION: Two vector spaces V and U over K are isomorphic, written V ≅ U, if there exists a
bijective (one-to-one and onto) linear mapping F : V → U. The mapping F is then
called an isomorphism between V and U.


Consider any vector space V of dimension n and let S be any basis of V. Then the mapping
v ↦ [v]_S


which maps each vector v ∈ V into its coordinate vector [v]_S, is an isomorphism between V and K^n.


5.4 Kernel and Image of a Linear Mapping 
We begin by defining two concepts. 


DEFINITION: Let F : V → U be a linear mapping. The kernel of F, written Ker F, is the set of
elements in V that map into the zero vector 0 in U; that is,
Ker F = {v ∈ V : F(v) = 0}
The image (or range) of F, written Im F, is the set of image points in U; that is,
Im F = {u ∈ U : there exists v ∈ V for which F(v) = u}
The following theorem is easily proved (Problem 5.22).


THEOREM 5.3: Let F : V → U be a linear mapping. Then the kernel of F is a subspace of V and the
image of F is a subspace of U.


Now suppose that v_1, v_2, ..., v_m span a vector space V and that F : V → U is linear. We show that
F(v_1), F(v_2), ..., F(v_m) span Im F. Let u ∈ Im F. Then there exists v ∈ V such that F(v) = u. Because
the v_i's span V and v ∈ V, there exist scalars a_1, a_2, ..., a_m for which


v = a_1 v_1 + a_2 v_2 + ··· + a_m v_m
Therefore,

u = F(v) = F(a_1 v_1 + a_2 v_2 + ··· + a_m v_m) = a_1 F(v_1) + a_2 F(v_2) + ··· + a_m F(v_m)
Thus, the vectors F(v_1), F(v_2), ..., F(v_m) span Im F.


We formally state the above result.


PROPOSITION 5.4: Suppose v_1, v_2, ..., v_m span a vector space V, and suppose F : V → U is linear.
Then F(v_1), F(v_2), ..., F(v_m) span Im F.
EXAMPLE 5.7
(a) Let F : R^3 → R^3 be the projection of a vector v into the xy-plane [as pictured in Fig. 5-2(a)]; that is,
F(x,y,z) = (x,y,0)


Clearly the image of F is the entire xy-plane; that is, points of the form (x,y,0). Moreover, the kernel of F is
the z-axis; that is, points of the form (0,0,c). That is,


Im F = {(a,b,c) : c = 0} = xy-plane and Ker F = {(a,b,c) : a = 0, b = 0} = z-axis
(b) Let G : R^3 → R^3 be the linear mapping that rotates a vector v about the z-axis through an angle θ [as pictured in
Fig. 5-2(b)]; that is,
G(x,y,z) = (x cos θ − y sin θ, x sin θ + y cos θ, z)

[Figure 5-2: panel (a) shows v = (a, b, c) and its projection F(v) = (a, b, 0); panel (b) shows the rotation G]


Observe that the distance of a vector v from the origin O does not change under the rotation, and so only the zero
vector 0 is mapped into the zero vector 0. Thus, Ker G = {0}. On the other hand, every vector u in R^3 is the image
of a vector v in R^3 that can be obtained by rotating u back by an angle of θ. Thus, Im G = R^3, the entire space.


EXAMPLE 5.8 Consider the vector space V = P(t) of polynomials over the real field R, and let H : V → V be the
third-derivative operator; that is, H[f(t)] = d^3 f/dt^3. [Sometimes the notation D^3 is used for H, where D is the
derivative operator.] We claim that


Ker H = {polynomials of degree ≤ 2} = P_2(t) and Im H = V


The first comes from the fact that H(at^2 + bt + c) = 0 but H(t^n) ≠ 0 for n ≥ 3. The second comes from the fact
that every polynomial g(t) in V is the third derivative of some polynomial f(t) (which can be obtained by taking the
antiderivative of g(t) three times).


Kernel and Image of Matrix Mappings 


Consider, say, a 3 × 4 matrix A and the usual basis {e_1, e_2, e_3, e_4} of K^4 (written as columns):

A = [a_1, a_2, a_3, a_4; b_1, b_2, b_3, b_4; c_1, c_2, c_3, c_4],
e_1 = [1,0,0,0]^T, e_2 = [0,1,0,0]^T, e_3 = [0,0,1,0]^T, e_4 = [0,0,0,1]^T


Recall that A may be viewed as a linear mapping A : K^4 → K^3, where the vectors in K^4 and K^3 are
viewed as column vectors. Now the usual basis vectors span K^4, so their images Ae_1, Ae_2, Ae_3, Ae_4 span
the image of A. But the vectors Ae_1, Ae_2, Ae_3, Ae_4 are precisely the columns of A:

Ae_1 = [a_1, b_1, c_1]^T, Ae_2 = [a_2, b_2, c_2]^T, Ae_3 = [a_3, b_3, c_3]^T, Ae_4 = [a_4, b_4, c_4]^T


Thus, the image of A is precisely the column space of A.


On the other hand, the kernel of A consists of all vectors v for which Av = 0. This means that the
kernel of A is the solution space of the homogeneous system AX = 0, called the null space of A.


We state the above results formally.


PROPOSITION 5.5: Let A be any m × n matrix over a field K viewed as a linear map A : K^n → K^m. Then
Ker A = nullsp(A) and Im A = colsp(A)
Here colsp(A) denotes the column space of A, and nullsp(A) denotes the null space of A.
Rank and Nullity of a Linear Mapping


Let F : V → U be a linear mapping. The rank of F is defined to be the dimension of its image, and the
nullity of F is defined to be the dimension of its kernel; namely,


rank(F) = dim(Im F) and nullity(F) = dim(Ker F)
The following important theorem (proved in Problem 5.23) holds.


THEOREM 5.6: Let V be of finite dimension, and let F : V → U be linear. Then
dim V = dim(Ker F) + dim(Im F) = nullity(F) + rank(F)
Recall that the rank of a matrix A was also defined to be the dimension of its column space and row
space. If we now view A as a linear mapping, then both definitions correspond, because the image of A is
precisely its column space.
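
The identification of the image of a matrix mapping with its column space rests on the fact that Ae_j is the j-th column of A. The sketch below checks this for one concrete matrix; the entries of A and the helper names `mat_vec` and `unit` are made up for illustration and do not come from the text.

```python
def mat_vec(A, v):
    """Multiply the matrix A (a list of rows) by the column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def unit(n, j):
    """The j-th usual basis vector of K^n (0-based index j)."""
    return [1 if i == j else 0 for i in range(n)]

# A hypothetical 3 x 4 matrix, viewed as a mapping from K^4 into K^3.
A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

columns = [[row[j] for row in A] for j in range(4)]
images = [mat_vec(A, unit(4, j)) for j in range(4)]
print(images == columns)   # the images of the usual basis are the columns of A

# A vector of the kernel (null space): A v = 0 for v = (1, -2, 1, 0).
print(mat_vec(A, [1, -2, 1, 0]))
```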


EXAMPLE 5.9 Let F : Rf — R? be the linear mapping defined by 
F(x,y,z,t)=(x-y+z+t, 2x—2y+3z4+4t, 3x-—3y4+4z+4 5t) 
(a) Find a basis and the dimension of the image of F 
First find the image of the usual basis vectors of R, 
F(1,0,0,0) = (1, 2,3), F (0,0, 1,0) = (1,3, 4) 
F(0,1,0,0) = (—1, —2, —3), F(0,0,0, 1) = (1,4, 5) 


By Proposition 5.4, the image vectors span Im F. Hence, form the matrix M whose rows are these image vectors 
and row reduce to echelon form: 


t 2 3 1 2 3 i 2 3 
-1 =} -3 00 0 011 
MO a ae a a t Fl looa 
1 4 5 02 2 00 0 


Thus, (1,2,3) and (0, 1,1) form a basis of Im F. Hence, dim(Im F) = 2 and rank(F) = 2. 


(b) Find a basis and the dimension of the kernel of the map F 
Set F(v) = 0, where v = (x,y,z, t), 


F(x,y,z,t)=(x-y+z+t, 2x—2y+3z+4t, 3x-— 3y + 4z+ 5t) = (0,0,0) 


Set corresponding components equal to each other to form the following homogeneous system whose solution 
space is Ker F: 


 x −  y +  z +  t = 0          x − y + z +  t = 0          x − y + z +  t = 0
2x − 2y + 3z + 4t = 0    or            z + 2t = 0    or            z + 2t = 0
3x − 3y + 4z + 5t = 0                  z + 2t = 0


The free variables are y and t. Hence, dim(Ker F) = 2 or nullity(F) = 2.
(i) Set y = 1, t = 0 to obtain the solution (1, 1, 0, 0),
(ii) Set y = 0, t = 1 to obtain the solution (1, 0, −2, 1).
Thus, (1, 1, 0, 0) and (1, 0, −2, 1) form a basis for Ker F.
As expected from Theorem 5.6, dim(Im F) + dim(Ker F) = 4 = dim R^4.
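
The dimension count in Example 5.9 can be checked numerically. The following is a minimal sketch, assuming the map F is represented by its coefficient matrix relative to the usual bases; the helper name `rank` is an illustrative choice, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def rank(rows):
    """Return the rank of a matrix (given as a list of rows) by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0])):
        # look for a row at or below position r with a nonzero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Coefficient matrix of F from Example 5.9 (one row per component of F)
A = [[1, -1, 1, 1],
     [2, -2, 3, 4],
     [3, -3, 4, 5]]

rank_F = rank(A)                 # dim(Im F)
nullity_F = len(A[0]) - rank_F   # dim(Ker F), by Theorem 5.6
print(rank_F, nullity_F)
```

The rank of the coefficient matrix equals dim(Im F) because the image of a matrix map is its column space, and the nullity is then read off from Theorem 5.6.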


Application to Systems of Linear Equations 


Let AX = B denote the matrix form of a system of m linear equations in n unknowns. Now the matrix A 
may be viewed as a linear mapping 


A: K^n → K^m
Thus, the solution of the equation AX = B may be viewed as the preimage of the vector B ∈ K^m under the
linear mapping A. Furthermore, the solution of the associated homogeneous system


AX = 0
may be viewed as the kernel of the linear mapping A. Applying Theorem 5.6 to this homogeneous system
yields

dim(Ker A) = dim K^n − dim(Im A) = n − rank A


But n is exactly the number of unknowns in the homogeneous system AX = 0. Thus, we have proved the
following theorem of Chapter 4.


THEOREM 4.19: The dimension of the solution space W of a homogeneous system AX = 0 of linear
equations is s = n − r, where n is the number of unknowns and r is the rank of the
coefficient matrix A.


Observe that r is also the number of pivot variables in an echelon form of AX = 0, so s = n − r is also
the number of free variables. Furthermore, the s solution vectors of AX = 0 described in Theorem 3.14
are linearly independent (Problem 4.52). Accordingly, because dim W = s, they form a basis for the
solution space W. Thus, we have also proved Theorem 3.14.


5.5 Singular and Nonsingular Linear Mappings, Isomorphisms 


Let F: V → U be a linear mapping. Recall that F(0) = 0. F is said to be singular if the image of some
nonzero vector v is 0; that is, if there exists v ≠ 0 such that F(v) = 0. Thus, F: V → U is nonsingular if
the zero vector 0 is the only vector whose image under F is 0 or, in other words, if Ker F = {0}.


EXAMPLE 5.10 Consider the projection map F: R^3 → R^3 and the rotation map G: R^3 → R^3 appearing in
Fig. 5-2. (See Example 5.7.) Because the kernel of F is the z-axis, F is singular. On the other hand, the kernel of G
consists only of the zero vector 0. Thus, G is nonsingular.


Nonsingular linear mappings may also be characterized as those mappings that carry independent sets 
into independent sets. Specifically, we prove (Problem 5.28) the following theorem. 


THEOREM 5.7: Let F: V → U be a nonsingular linear mapping. Then the image of any linearly
independent set is linearly independent.


Isomorphisms 


Suppose a linear mapping F: V → U is one-to-one. Then only 0 ∈ V can map into 0 ∈ U, and so F is
nonsingular. The converse is also true. For suppose F is nonsingular and F(v) = F(w); then
F(v − w) = F(v) − F(w) = 0, and hence, v − w = 0 or v = w. Thus, F(v) = F(w) implies v = w;
that is, F is one-to-one. We have proved the following proposition.


PROPOSITION 5.8: A linear mapping F: V → U is one-to-one if and only if F is nonsingular.


Recall that a mapping F: V → U is called an isomorphism if F is linear and if F is bijective (i.e., if F
is one-to-one and onto). Also, recall that a vector space V is said to be isomorphic to a vector space U,
written V ≅ U, if there is an isomorphism F: V → U.

The following theorem (proved in Problem 5.29) applies.


THEOREM 5.9: Suppose V has finite dimension and dim V = dim U. Suppose F: V → U is linear.
Then F is an isomorphism if and only if F is nonsingular.
5.6 Operations with Linear Mappings


We are able to combine linear mappings in various ways to obtain new linear mappings. These operations 
are very important and will be used throughout the text. 

Let F: V → U and G: V → U be linear mappings over a field K. The sum F + G and the scalar
product kF, where k ∈ K, are defined to be the following mappings from V into U:


(F + G)(v) = F(v) + G(v) and (kF)(v) = kF(v)


We now show that if F and G are linear, then F + G and kF are also linear. Specifically, for any vectors
v, w ∈ V and any scalars a, b ∈ K,


(F + G)(av + bw) = F(av + bw) + G(av + bw)
= aF(v) + bF(w) + aG(v) + bG(w)
= a[F(v) + G(v)] + b[F(w) + G(w)]
= a(F + G)(v) + b(F + G)(w)
and (kF)(av + bw) = kF(av + bw) = k[aF(v) + bF(w)]
= akF(v) + bkF(w) = a(kF)(v) + b(kF)(w)


Thus, F + G and kF are linear.
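
The linearity of F + G and kF can be spot-checked on concrete maps. The following is a minimal sketch, assuming vectors are tuples of numbers; the two maps F and G and the helper names `add`, `scale`, and `comb` are hypothetical and chosen only for illustration.

```python
def add(F, G):
    """Pointwise sum: (F + G)(v) = F(v) + G(v)."""
    return lambda v: tuple(a + b for a, b in zip(F(v), G(v)))

def scale(k, F):
    """Scalar product: (kF)(v) = k F(v)."""
    return lambda v: tuple(k * a for a in F(v))

def comb(a, v, b, w):
    """Linear combination av + bw of two tuples."""
    return tuple(a * x + b * y for x, y in zip(v, w))

# Two hypothetical linear maps R^2 -> R^2
F = lambda v: (v[0] + v[1], v[0])
G = lambda v: (2 * v[0], 3 * v[1])

H = add(F, scale(5, G))            # the map F + 5G
v, w, a, b = (1, 2), (3, -4), 2, -3
lhs = H(comb(a, v, b, w))          # H(av + bw)
rhs = comb(a, H(v), b, H(w))       # aH(v) + bH(w)
print(lhs == rhs)
```

A single numerical check does not prove linearity; it only illustrates the identity derived above on one choice of vectors and scalars.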
The following theorem holds. 


THEOREM 5.10: Let V and U be vector spaces over a field K. Then the collection of all linear
mappings from V into U with the above operations of addition and scalar
multiplication forms a vector space over K.


The vector space of linear mappings in Theorem 5.10 is usually denoted by 
Hom(V, U) 


Here Hom comes from the word "homomorphism." We emphasize that the proof of Theorem 5.10
reduces to showing that Hom(V, U) does satisfy the eight axioms of a vector space. The zero element of
Hom(V, U) is the zero mapping from V into U, denoted by 0 and defined by


0(v) = 0


for every vector v ∈ V.
Suppose V and U are of finite dimension. Then we have the following theorem.


THEOREM 5.11: Suppose dim V = m and dim U = n. Then dim[Hom(V, U)] = mn.


Composition of Linear Mappings 


Now suppose V, U, and W are vector spaces over the same field K, and suppose F: V → U and
G: U → W are linear mappings. We picture these mappings as follows:


V --F--> U --G--> W

Recall that the composition function G∘F is the mapping from V into W defined by
(G∘F)(v) = G(F(v)). We show that G∘F is linear whenever F and G are linear. Specifically, for
any vectors v, w ∈ V and any scalars a, b ∈ K, we have


(G∘F)(av + bw) = G(F(av + bw)) = G(aF(v) + bF(w))
= aG(F(v)) + bG(F(w)) = a(G∘F)(v) + b(G∘F)(w)
Thus, G∘F is linear.


The composition of linear mappings and the operations of addition and scalar multiplication are 
related as follows. 
THEOREM 5.12: Let V, U, W be vector spaces over K. Suppose the following mappings are linear:
F: V → U, F′: V → U and G: U → W, G′: U → W
Then, for any scalar k ∈ K:
(i) G∘(F + F′) = G∘F + G∘F′.
(ii) (G + G′)∘F = G∘F + G′∘F.
(iii) k(G∘F) = (kG)∘F = G∘(kF).


5.7 Algebra A(V) of Linear Operators 


Let V be a vector space over a field K. This section considers the special case of linear mappings from the
vector space V into itself; that is, linear mappings of the form F: V → V. They are also called linear
operators or linear transformations on V. We will write A(V), instead of Hom(V, V), for the space of all
such mappings.

Now A(V) is a vector space over K (Theorem 5.10), and, if dim V = n, then dim A(V) = n^2 (Theorem 5.11). Moreover,
for any mappings F, G ∈ A(V), the composition G∘F exists and also belongs to A(V). Thus, we have a
"multiplication" defined in A(V). [We sometimes write FG instead of G∘F in the space A(V).]


Remark: An algebra A over a field K is a vector space over K in which an operation of
multiplication is defined satisfying, for every F, G, H ∈ A and every k ∈ K:
(i) F(G + H) = FG + FH,
(ii) (G + H)F = GF + HF,
(iii) k(GF) = (kG)F = G(kF).
The algebra is said to be associative if, in addition, (FG)H = F(GH).

The above definition of an algebra and previous theorems give us the following result.

THEOREM 5.13: Let V be a vector space over K. Then A(V) is an associative algebra over K with
respect to composition of mappings. If dim V = n, then dim A(V) = n^2.


This is why A(V) is called the algebra of linear operators on V. 


Polynomials and Linear Operators 


Observe that the identity mapping I: V → V belongs to A(V). Also, for any linear operator F in A(V),
we have FI = IF = F. We can also form "powers" of F. Namely, we define


F^0 = I, F^2 = F∘F, F^3 = F^2∘F = F∘F∘F, F^4 = F^3∘F, ...


Furthermore, for any polynomial p(t) over K, say,


p(t) = a_0 + a_1 t + a_2 t^2 + ... + a_n t^n
we can form the linear operator p(F) defined by
p(F) = a_0 I + a_1 F + a_2 F^2 + ... + a_n F^n


(For any scalar k, the operator kI is sometimes denoted simply by k.) In particular, we say F is a zero of
the polynomial p(t) if p(F) = 0.
EXAMPLE 5.11 Let F: K^3 → K^3 be defined by F(x,y,z) = (0,x,y). For any (a,b,c) ∈ K^3,
(F + I)(a,b,c) = (0,a,b) + (a,b,c) = (a, a + b, b + c)
F^3(a,b,c) = F^2(0,a,b) = F(0,0,a) = (0,0,0)


Thus, F^3 = 0, the zero mapping in A(V). This means F is a zero of the polynomial p(t) = t^3.
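
Powers of an operator are just repeated composition, which is easy to sketch in code. The snippet below is an illustration only, assuming K is taken to be the integers for the sample vectors; the helper name `power` is hypothetical.

```python
def F(v):
    """The operator of Example 5.11: F(x, y, z) = (0, x, y)."""
    x, y, z = v
    return (0, x, y)

def power(op, n):
    """Return op composed with itself n times; n = 0 gives the identity map."""
    def result(v):
        for _ in range(n):
            v = op(v)
        return v
    return result

basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
F2_on_basis = [power(F, 2)(e) for e in basis]
F3_on_basis = [power(F, 3)(e) for e in basis]
print(F2_on_basis)  # not all zero, so F^2 is not the zero mapping
print(F3_on_basis)  # all zero, so F^3 = 0 on a basis and hence everywhere
```

Because a linear map is determined by its values on a basis, checking the three basis vectors is enough to confirm F^3 = 0, while F^2 is still nonzero.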
Square Matrices as Linear Operators


Let M = M_{n,n} be the vector space of all square n × n matrices over K. Then any matrix A in M defines a
linear mapping F_A: K^n → K^n by F_A(u) = Au (where the vectors in K^n are written as columns). Because the
mapping is from K^n into itself, the square matrix A is a linear operator, not simply a linear mapping.

Suppose A and B are matrices in M. Then the matrix product AB is defined. Furthermore, for any
(column) vector u in K^n,


F_{AB}(u) = (AB)u = A(Bu) = A(F_B(u)) = F_A(F_B(u)) = (F_A∘F_B)(u)


In other words, the matrix product AB corresponds to the composition of A and B as linear mappings.
Similarly, the matrix sum A + B corresponds to the sum of A and B as linear mappings, and the scalar
product kA corresponds to the scalar product of A as a linear mapping.
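
The identity (AB)u = A(Bu) can be illustrated with small hand-rolled matrix routines. This is a sketch under stated assumptions: matrices are lists of rows, and the particular A, B, and u are hypothetical sample data.

```python
def matvec(A, u):
    """Apply the matrix A (list of rows) to the column vector u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def matmul(A, B):
    """Matrix product AB, computed row by column."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

# Hypothetical 2 x 2 matrices and a test vector
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
u = [7, -1]

via_product = matvec(matmul(A, B), u)        # F_{AB}(u)
via_composition = matvec(A, matvec(B, u))    # (F_A o F_B)(u)
print(via_product, via_composition)
```

Both routes give the same vector, which is the computational content of the statement that the product AB corresponds to the composition of A and B as mappings.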


Invertible Operators in A(V) 


Let F: V → V be a linear operator. F is said to be invertible if it has an inverse; that is, if there exists
F^{-1} in A(V) such that FF^{-1} = F^{-1}F = I. On the other hand, F is invertible as a mapping if F is both
one-to-one and onto. In such a case, F^{-1} is also linear and F^{-1} is the inverse of F as a linear operator
(proved in Problem 5.15).

Suppose F is invertible. Then only 0 ∈ V can map into 0, and so F is nonsingular. The converse is
not true, as seen by the following example.


EXAMPLE 5.12 Let V = P(t), the vector space of polynomials over K. Let F be the mapping on V that increases
by 1 the exponent of t in each term of a polynomial; that is,


F(a_0 + a_1 t + a_2 t^2 + ... + a_n t^n) = a_0 t + a_1 t^2 + a_2 t^3 + ... + a_n t^{n+1}


Then F is a linear mapping and F is nonsingular. However, F is not onto, and so F is not invertible.
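
A short sketch makes the behavior of this shift operator concrete. It assumes a polynomial a_0 + a_1 t + ... + a_n t^n is stored as the coefficient list [a_0, a_1, ..., a_n]; this representation is an illustrative choice, not part of the example.

```python
def F(coeffs):
    """Multiply a polynomial by t: [a0, a1, ..., an] -> [0, a0, a1, ..., an]."""
    return [0] + list(coeffs)

p = [3, 0, -2]     # the polynomial 3 - 2t^2
image = F(p)       # the polynomial 3t - 2t^3
print(image)

# The constant term of every image is 0, so the constant polynomial 1
# (coefficient list [1]) is never an image: F is not onto.
# On the other hand, F(coeffs) has only zero coefficients exactly when
# coeffs does, so only the zero polynomial maps to zero: F is nonsingular.
```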


The vector space V = P(t) in the above example has infinite dimension. The situation changes 
significantly when V has finite dimension. Namely, the following theorem applies. 


THEOREM 5.14: Let F be a linear operator on a finite-dimensional vector space V. Then the following
four conditions are equivalent.
(i) F is nonsingular: Ker F = {0}.
(ii) F is one-to-one.
(iii) F is an onto mapping.
(iv) F is invertible.
The proof of the above theorem mainly follows from Theorem 5.6, which tells us that 
dim V = dim(Ker F) + dim(Im F) 
By Proposition 5.8, (i) and (ii) are equivalent. Note that (iv) is equivalent to (ii) and (iii). Thus, to prove 
the theorem, we need only show that (i) and (iii) are equivalent. This we do below. 
(a) Suppose (i) holds. Then dim(Ker F) = 0, and so the above equation tells us that dim V = dim(Im F). 
This means V = Im F or, in other words, F is an onto mapping. Thus, (i) implies (iii). 
(b) Suppose (iii) holds. Then V = Im F, and so dim V = dim(Im F). Therefore, the above equation 
tells us that dim(Ker F) = 0, and so F is nonsingular. Therefore, (iii) implies (i). 


Accordingly, all four conditions are equivalent. 


Remark: Suppose A is a square n × n matrix over K. Then A may be viewed as a linear operator on
K^n. Because K^n has finite dimension, Theorem 5.14 holds for the square matrix A. This is why the terms
"nonsingular" and "invertible" are used interchangeably when applied to square matrices.


EXAMPLE 5.13 Let F be the linear operator on R^2 defined by F(x,y) = (2x + y, 3x + 2y).


(a) To show that F is invertible, we need only show that F is nonsingular. Set F(x,y) = (0,0) to obtain the
homogeneous system


2x + y = 0 and 3x + 2y = 0
Solve for x and y to get x = 0, y = 0. Hence, F is nonsingular and so invertible.
(b) To find a formula for F^{-1}, we set F(x,y) = (s,t) and so F^{-1}(s,t) = (x,y). We have

(2x + y, 3x + 2y) = (s,t) or 2x + y = s, 3x + 2y = t


Solve for x and y in terms of s and t to obtain x = 2s − t, y = −3s + 2t. Thus,
F^{-1}(s,t) = (2s − t, −3s + 2t) or F^{-1}(x,y) = (2x − y, −3x + 2y)


where we rewrite the formula for F^{-1} using x and y instead of s and t.
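
The inverse formula of Example 5.13 can be sanity-checked by composing the two maps in both orders. This is a minimal sketch; the function name `F_inv` and the sample points are illustrative assumptions.

```python
def F(x, y):
    """The operator of Example 5.13."""
    return (2 * x + y, 3 * x + 2 * y)

def F_inv(s, t):
    """Candidate inverse obtained by solving 2x + y = s, 3x + 2y = t."""
    return (2 * s - t, -3 * s + 2 * t)

samples = [(0, 0), (1, 0), (0, 1), (3, -5), (-2, 7)]
round_trip_ok = all(
    F_inv(*F(x, y)) == (x, y) and F(*F_inv(x, y)) == (x, y)
    for x, y in samples
)
print(round_trip_ok)
```

Since both maps are linear, agreement on the two standard basis vectors already forces agreement everywhere; the extra points are included only as a further check.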


SOLVED PROBLEMS 
Mappings 
5.1. State whether each diagram in Fig. 5-3 defines a mapping from A = {a,b,c} into B = {x,y,z}. 


(a) No. There is nothing assigned to the element b ∈ A.
(b) No. Two elements, x and z, are assigned to c ∈ A.
(c) Yes.

Figure 5-3 (panels (a), (b), (c))

5.2. Let f: A → B and g: B → C be defined by Fig. 5-4.
(a) Find the composition mapping (g∘f): A → C.
(b) Find the images of the mappings f, g, g∘f.

Figure 5-4 (arrow diagram of f: A → B and g: B → C)


(a) Use the definition of the composition mapping to compute


(g∘f)(a) = g(f(a)) = g(y) = t, (g∘f)(b) = g(f(b)) = g(x) = s,
(g∘f)(c) = g(f(c)) = g(y) = t
Observe that we arrive at the same answer if we "follow the arrows" in Fig. 5-4:


a → y → t, b → x → s, c → y → t


(b) By Fig. 5-4, the image values under the mapping f are x and y, and the image values under g are r, s, t.
Hence,
Im f = {x, y} and Im g = {r, s, t}


Also, by part (a), the image values under the composition mapping g∘f are t and s; accordingly,
Im g∘f = {s, t}. Note that the images of g and g∘f are different.


5.3. Consider the mapping F: R^3 → R^2 defined by F(x,y,z) = (yz, x^2). Find
(a) F(2,3,4); (b) F(5,−2,7); (c) F^{-1}(0,0), that is, all v ∈ R^3 such that F(v) = 0.
(a) Substitute in the formula for F to get F(2,3,4) = (3·4, 2^2) = (12, 4).
(b) F(5,−2,7) = (−2·7, 5^2) = (−14, 25).
(c) Set F(v) = 0, where v = (x,y,z), and then solve for x, y, z:
F(x,y,z) = (yz, x^2) = (0,0) or yz = 0, x^2 = 0


Thus, x = 0 and either y = 0 or z = 0. In other words, x = 0, y = 0 or x = 0, z = 0; that is, the z-axis
and the y-axis.


5.4. Consider the mapping F: R^2 → R^2 defined by F(x,y) = (3y, 2x). Let S be the unit circle in R^2,
that is, the solution set of x^2 + y^2 = 1. (a) Describe F(S). (b) Find F^{-1}(S).


(a) Let (a,b) be an element of F(S). Then there exists (x,y) ∈ S such that F(x,y) = (a,b). Hence,


(3y, 2x) = (a,b) or 3y = a, 2x = b or y = a/3, x = b/2


Because (x,y) ∈ S, that is, x^2 + y^2 = 1, we have


(b/2)^2 + (a/3)^2 = 1 or a^2/9 + b^2/4 = 1
Thus, F(S) is an ellipse.


(b) Let F(x,y) = (a,b), where (a,b) ∈ S. Then (3y, 2x) = (a,b) or 3y = a, 2x = b. Because (a,b) ∈ S, we
have a^2 + b^2 = 1. Thus, (3y)^2 + (2x)^2 = 1. Accordingly, F^{-1}(S) is the ellipse 4x^2 + 9y^2 = 1.


5.5. Let the mappings f: A → B, g: B → C, h: C → D be defined by Fig. 5-5. Determine whether or
not each function is (a) one-to-one; (b) onto; (c) invertible (i.e., has an inverse).


(a) The mapping f: A → B is one-to-one, as each element of A has a different image. The mapping
g: B → C is not one-to-one, because x and z both have the same image 4. The mapping h: C → D is
one-to-one.

(b) The mapping f: A → B is not onto, because z ∈ B is not the image of any element of A. The mapping
g: B → C is onto, as each element of C is the image of some element of B. The mapping h: C → D is
also onto.


(c) A mapping has an inverse if and only if it is one-to-one and onto. Hence, only h has an inverse.


Figure 5-5 
5.6. Suppose f: A → B and g: B → C. Hence, (g∘f): A → C exists. Prove
(a) If f and g are one-to-one, then g∘f is one-to-one.
(b) If f and g are onto mappings, then g∘f is an onto mapping.
(c) If g∘f is one-to-one, then f is one-to-one.
(d) If g∘f is an onto mapping, then g is an onto mapping.


(a) Suppose (g∘f)(x) = (g∘f)(y). Then g(f(x)) = g(f(y)). Because g is one-to-one, f(x) = f(y).
Because f is one-to-one, x = y. We have proven that (g∘f)(x) = (g∘f)(y) implies x = y; hence g∘f
is one-to-one.

(b) Suppose c ∈ C. Because g is onto, there exists b ∈ B for which g(b) = c. Because f is onto, there exists
a ∈ A for which f(a) = b. Thus, (g∘f)(a) = g(f(a)) = g(b) = c. Hence, g∘f is onto.

(c) Suppose f is not one-to-one. Then there exist distinct elements x, y ∈ A for which f(x) = f(y). Thus,
(g∘f)(x) = g(f(x)) = g(f(y)) = (g∘f)(y). Hence, g∘f is not one-to-one. Therefore, if g∘f is
one-to-one, then f must be one-to-one.

(d) If a ∈ A, then (g∘f)(a) = g(f(a)) ∈ g(B). Hence, (g∘f)(A) ⊆ g(B). Suppose g is not onto. Then g(B)
is properly contained in C and so (g∘f)(A) is properly contained in C; thus, g∘f is not onto.
Accordingly, if g∘f is onto, then g must be onto.


5.7. Prove that f: A → B has an inverse if and only if f is one-to-one and onto.


Suppose f has an inverse; that is, there exists a function f^{-1}: B → A for which f^{-1}∘f = 1_A and
f∘f^{-1} = 1_B. Because 1_A is one-to-one, f is one-to-one by Problem 5.6(c), and because 1_B is onto, f is onto
by Problem 5.6(d); that is, f is both one-to-one and onto.

Now suppose f is both one-to-one and onto. Then each b ∈ B is the image of a unique element in A, say
b*. Thus, if f(a) = b, then a = b*; hence, f(b*) = b. Now let g denote the mapping from B to A defined by
b ↦ b*. We have


(i) (g∘f)(a) = g(f(a)) = g(b) = b* = a for every a ∈ A; hence, g∘f = 1_A.
(ii) (f∘g)(b) = f(g(b)) = f(b*) = b for every b ∈ B; hence, f∘g = 1_B.


Accordingly, f has an inverse. Its inverse is the mapping g.


5.8. Let f: R → R be defined by f(x) = 2x − 3. Now f is one-to-one and onto; hence, f has an inverse
mapping f^{-1}. Find a formula for f^{-1}.

Let y be the image of x under the mapping f; that is, y = f(x) = 2x − 3. Hence, x will be the image of y
under the inverse mapping f^{-1}. Thus, solve for x in terms of y in the above equation to obtain x = (y + 3)/2.
Then the formula defining the inverse function is f^{-1}(y) = (y + 3)/2, or, using x instead of y, f^{-1}(x) = (x + 3)/2.


Linear Mappings 


5.9. Suppose the mapping F: R^2 → R^2 is defined by F(x,y) = (x + y, x). Show that F is linear.


We need to show that F(v + w) = F(v) + F(w) and F(kv) = kF(v), where v and w are any elements of
R^2 and k is any scalar. Let v = (a,b) and w = (a′,b′). Then


v + w = (a + a′, b + b′) and kv = (ka, kb)
We have F(v) = (a + b, a) and F(w) = (a′ + b′, a′). Thus,
F(v + w) = F(a + a′, b + b′) = (a + a′ + b + b′, a + a′)
= (a + b, a) + (a′ + b′, a′) = F(v) + F(w)


and
F(kv) = F(ka, kb) = (ka + kb, ka) = k(a + b, a) = kF(v)


Because v, w, k were arbitrary, F is linear.
5.10. Suppose F: R^3 → R^2 is defined by F(x,y,z) = (x + y + z, 2x − 3y + 4z). Show that F is linear.


We argue via matrices. Writing vectors as columns, the mapping F may be written in the form
F(v) = Av, where v = [x, y, z]^T and

A = \begin{bmatrix} 1 & 1 & 1 \\ 2 & -3 & 4 \end{bmatrix}

Then, using properties of matrices, we have


F(v + w) = A(v + w) = Av + Aw = F(v) + F(w)


and F(kv) = A(kv) = k(Av) = kF(v)
Thus, F is linear.


5.11. Show that the following mappings are not linear:
(a) F: R^2 → R^2 defined by F(x,y) = (xy, x)
(b) F: R^2 → R^3 defined by F(x,y) = (x + 3, 2y, x + y)
(c) F: R^3 → R^2 defined by F(x,y,z) = (|x|, y + z)

(a) Let v = (1,2) and w = (3,4); then v + w = (4,6). Also,
F(v) = (1(2), 1) = (2,1) and F(w) = (3(4), 3) = (12,3)
Hence,
F(v + w) = (4(6), 4) = (24,4) ≠ (14,4) = F(v) + F(w)
(b) Because F(0,0) = (3,0,0) ≠ (0,0,0), F cannot be linear.
(c) Let v = (1,2,3) and k = −3. Then kv = (−3,−6,−9). We have
F(v) = (1,5) and kF(v) = −3(1,5) = (−3,−15).
Thus,
F(kv) = F(−3,−6,−9) = (3,−15) ≠ kF(v)
Accordingly, F is not linear.


5.12. Let V be the vector space of n-square real matrices. Let M be an arbitrary but fixed matrix in V.
Let F: V → V be defined by F(A) = AM + MA, where A is any matrix in V. Show that F is
linear.


For any matrices A and B in V and any scalar k, we have


F(A + B) = (A + B)M + M(A + B) = AM + BM + MA + MB
= (AM + MA) + (BM + MB) = F(A) + F(B)


and
F(kA) = (kA)M + M(kA) = k(AM) + k(MA) = k(AM + MA) = kF(A)
Thus, F is linear.


5.13. Prove Theorem 5.2: Let V and U be vector spaces over a field K. Let {v_1, v_2, ..., v_n} be a basis of
V and let u_1, u_2, ..., u_n be any vectors in U. Then there exists a unique linear mapping F: V → U
such that F(v_1) = u_1, F(v_2) = u_2, ..., F(v_n) = u_n.


There are three steps to the proof of the theorem: (1) Define the mapping F: V → U such that
F(v_i) = u_i, i = 1, ..., n. (2) Show that F is linear. (3) Show that F is unique.


Step 1. Let v ∈ V. Because {v_1, ..., v_n} is a basis of V, there exist unique scalars a_1, ..., a_n ∈ K for
which v = a_1 v_1 + a_2 v_2 + ... + a_n v_n. We define F: V → U by


F(v) = a_1 u_1 + a_2 u_2 + ... + a_n u_n
(Because the a_i are unique, the mapping F is well defined.) Now, for i = 1, ..., n,
v_i = 0v_1 + ... + 1v_i + ... + 0v_n
Hence,
F(v_i) = 0u_1 + ... + 1u_i + ... + 0u_n = u_i


Thus, the first step of the proof is complete.
Step 2. Suppose v = a_1 v_1 + a_2 v_2 + ... + a_n v_n and w = b_1 v_1 + b_2 v_2 + ... + b_n v_n. Then


v + w = (a_1 + b_1)v_1 + (a_2 + b_2)v_2 + ... + (a_n + b_n)v_n


and, for any k ∈ K, kv = ka_1 v_1 + ka_2 v_2 + ... + ka_n v_n. By definition of the mapping F,
F(v) = a_1 u_1 + a_2 u_2 + ... + a_n u_n and F(w) = b_1 u_1 + b_2 u_2 + ... + b_n u_n


Hence,
F(v + w) = (a_1 + b_1)u_1 + (a_2 + b_2)u_2 + ... + (a_n + b_n)u_n
= (a_1 u_1 + a_2 u_2 + ... + a_n u_n) + (b_1 u_1 + b_2 u_2 + ... + b_n u_n)
= F(v) + F(w)
and


F(kv) = k(a_1 u_1 + a_2 u_2 + ... + a_n u_n) = kF(v)

Thus, F is linear.

Step 3. Suppose G: V → U is linear and G(v_i) = u_i, i = 1, ..., n. Let
v = a_1 v_1 + a_2 v_2 + ... + a_n v_n
Then
G(v) = G(a_1 v_1 + a_2 v_2 + ... + a_n v_n) = a_1 G(v_1) + a_2 G(v_2) + ... + a_n G(v_n)
= a_1 u_1 + a_2 u_2 + ... + a_n u_n = F(v)
Because G(v) = F(v) for every v ∈ V, G = F. Thus, F is unique and the theorem is proved.


5.14. Let F: R^2 → R^2 be the linear mapping for which F(1,2) = (2,3) and F(0,1) = (1,4). [Note that
{(1,2), (0,1)} is a basis of R^2, so such a linear map F exists and is unique by Theorem 5.2.] Find
a formula for F; that is, find F(a,b).


Write (a,b) as a linear combination of (1,2) and (0,1) using unknowns x and y,
(a,b) = x(1,2) + y(0,1) = (x, 2x + y), so a = x, b = 2x + y
Solve for x and y in terms of a and b to get x = a, y = −2a + b. Then
F(a,b) = xF(1,2) + yF(0,1) = a(2,3) + (−2a + b)(1,4) = (b, −5a + 4b)
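
The construction in Problem 5.14 (coordinates relative to the basis, then the same combination of the prescribed images) translates directly into code. The following is a minimal sketch; the function name `F` and the variable names are illustrative assumptions.

```python
def F(a, b):
    """Linear map determined by F(1,2) = (2,3) and F(0,1) = (1,4)."""
    # Coordinates of (a, b) relative to the basis {(1,2), (0,1)}
    x, y = a, b - 2 * a
    # Prescribed images of the basis vectors
    u1, u2 = (2, 3), (1, 4)
    # F(a, b) = x * F(1,2) + y * F(0,1)
    return (x * u1[0] + y * u2[0], x * u1[1] + y * u2[1])

print(F(1, 2), F(0, 1))  # reproduces the prescribed images
print(F(3, 5))           # agrees with the closed form (b, -5a + 4b)
```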


5.15. Suppose a linear mapping F: V → U is one-to-one and onto. Show that the inverse mapping
F^{-1}: U → V is also linear.


Suppose u, u′ ∈ U. Because F is one-to-one and onto, there exist unique vectors v, v′ ∈ V for which
F(v) = u and F(v′) = u′. Because F is linear, we also have


F(v + v′) = F(v) + F(v′) = u + u′ and F(kv) = kF(v) = ku
By definition of the inverse mapping,
F^{-1}(u) = v, F^{-1}(u′) = v′, F^{-1}(u + u′) = v + v′, F^{-1}(ku) = kv.
Then
F^{-1}(u + u′) = v + v′ = F^{-1}(u) + F^{-1}(u′) and F^{-1}(ku) = kv = kF^{-1}(u)


Thus, F^{-1} is linear.
Kernel and Image of Linear Mappings
5.16. Let F: R^4 → R^3 be the linear mapping defined by


F(x,y,z,t) = (x − y + z + t, x + 2z − t, x + y + 3z − 3t)
Find a basis and the dimension of (a) the image of F, (b) the kernel of F.
(a) Find the images of the usual basis of R^4:
F(1,0,0,0) = (1,1,1), F(0,0,1,0) = (1,2,3)
F(0,1,0,0) = (−1,0,1), F(0,0,0,1) = (1,−1,−3)


By Proposition 5.4, the image vectors span Im F. Hence, form the matrix whose rows are these image
vectors, and row reduce to echelon form:


\begin{bmatrix} 1 & 1 & 1 \\ -1 & 0 & 1 \\ 1 & 2 & 3 \\ 1 & -1 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 1 & 2 \\ 0 & -2 & -4 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}


Thus, (1,1,1) and (0,1,2) form a basis for Im F; hence, dim(Im F) = 2.
(b) Set F(v) = 0, where v = (x,y,z,t); that is, set


F(x,y,z,t) = (x − y + z + t, x + 2z − t, x + y + 3z − 3t) = (0,0,0)


Set corresponding entries equal to each other to form the following homogeneous system whose solution
space is Ker F:


x − y +  z +  t = 0          x − y +  z +  t = 0          x − y + z +  t = 0
x     + 2z −  t = 0    or        y +  z − 2t = 0    or        y + z − 2t = 0
x + y + 3z − 3t = 0             2y + 2z − 4t = 0


The free variables are z and t. Hence, dim(Ker F) = 2.
(i) Set z = −1, t = 0 to obtain the solution (2, 1, −1, 0).
(ii) Set z = 0, t = 1 to obtain the solution (1, 2, 0, 1).


Thus, (2, 1, −1, 0) and (1, 2, 0, 1) form a basis of Ker F.
[As expected, dim(Im F) + dim(Ker F) = 2 + 2 = 4 = dim R^4, the domain of F.]


5.17. Let G: R^3 → R^3 be the linear mapping defined by


G(x,y,z) = (x + 2y − z, y + z, x + y − 2z)
Find a basis and the dimension of (a) the image of G, (b) the kernel of G.
(a) Find the images of the usual basis of R^3:
G(1,0,0) = (1,0,1), G(0,1,0) = (2,1,1), G(0,0,1) = (−1,1,−2)


By Proposition 5.4, the image vectors span Im G. Hence, form the matrix M whose rows are these image
vectors, and row reduce to echelon form:


M = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 1 & 1 \\ -1 & 1 & -2 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 1 & -1 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}
Thus, (1,0,1) and (0,1,−1) form a basis for Im G; hence, dim(Im G) = 2.


(b) Set G(v) = 0, where v = (x,y,z); that is,


G(x,y,z) = (x + 2y − z, y + z, x + y − 2z) = (0,0,0)
Set corresponding entries equal to each other to form the following homogeneous system whose solution
space is Ker G:


x + 2y −  z = 0          x + 2y − z = 0          x + 2y − z = 0
     y +  z = 0    or         y + z = 0    or         y + z = 0
x +  y − 2z = 0              −y − z = 0


The only free variable is z; hence, dim(Ker G) = 1. Set z = 1; then y = −1 and x = 3. Thus, (3, −1, 1)
forms a basis of Ker G. [As expected, dim(Im G) + dim(Ker G) = 2 + 1 = 3 = dim R^3, the domain
of G.]
5.18. Consider the matrix mapping A: R^4 → R^3, where

A = \begin{bmatrix} 1 & 2 & 3 & 1 \\ 1 & 3 & 5 & -2 \\ 3 & 8 & 13 & -3 \end{bmatrix}

Find a basis and the dimension of (a) the image of A, (b) the kernel of A.


(a) The column space of A is equal to Im A. Now reduce A^T to echelon form:


A^T = \begin{bmatrix} 1 & 1 & 3 \\ 2 & 3 & 8 \\ 3 & 5 & 13 \\ 1 & -2 & -3 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 3 \\ 0 & 1 & 2 \\ 0 & 2 & 4 \\ 0 & -3 & -6 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}


Thus, {(1,1,3), (0,1,2)} is a basis of Im A, and dim(Im A) = 2.


(b) Here Ker A is the solution space of the homogeneous system AX = 0, where X = [x, y, z, t]^T. Thus,
reduce the matrix A of coefficients to echelon form:


A \sim \begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & 1 & 2 & -3 \\ 0 & 2 & 4 & -6 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 3 & 1 \\ 0 & 1 & 2 & -3 \\ 0 & 0 & 0 & 0 \end{bmatrix}    or    x + 2y + 3z + t = 0,  y + 2z − 3t = 0


The free variables are z and t. Thus, dim(Ker A) = 2.


(i) Set z = 1, t = 0 to get the solution (1, −2, 1, 0).
(ii) Set z = 0, t = 1 to get the solution (−7, 3, 0, 1).


Thus, (1, −2, 1, 0) and (−7, 3, 0, 1) form a basis for Ker A.
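
Membership in the kernel is easy to confirm by direct multiplication. The sketch below assumes the matrix A of Problem 5.18 stored as a list of rows; the helper name `matvec` is an illustrative choice.

```python
def matvec(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# The matrix of Problem 5.18
A = [[1, 2, 3, 1],
     [1, 3, 5, -2],
     [3, 8, 13, -3]]

kernel_basis = [(1, -2, 1, 0), (-7, 3, 0, 1)]
images = [matvec(A, v) for v in kernel_basis]
print(images)  # each image should be the zero vector of R^3
```

The two vectors are also visibly independent, since their last two coordinates are (1, 0) and (0, 1); together with dim(Ker A) = 2 this confirms they form a basis.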


5.19. Find a linear map F: R^3 → R^4 whose image is spanned by (1,2,0,−4) and (2,0,−1,−3).


Form a 4 × 3 matrix whose columns consist only of the given vectors, say


A = \begin{bmatrix} 1 & 2 & 2 \\ 2 & 0 & 0 \\ 0 & -1 & -1 \\ -4 & -3 & -3 \end{bmatrix}


Recall that A determines a linear map A: R^3 → R^4 whose image is spanned by the columns of A. Thus, A
satisfies the required condition.


5.20. Suppose f: V → U is linear with kernel W, and that f(v) = u. Show that the "coset"
v + W = {v + w : w ∈ W} is the preimage of u; that is, f^{-1}(u) = v + W.


We must prove that (i) f^{-1}(u) ⊆ v + W and (ii) v + W ⊆ f^{-1}(u).
We first prove (i). Suppose v′ ∈ f^{-1}(u). Then f(v′) = u, and so
f(v′ − v) = f(v′) − f(v) = u − u = 0
that is, v′ − v ∈ W. Thus, v′ = v + (v′ − v) ∈ v + W, and hence f^{-1}(u) ⊆ v + W.

Now we prove (ii). Suppose v′ ∈ v + W. Then v′ = v + w, where w ∈ W. Because W is the kernel of f,
we have f(w) = 0. Accordingly,


f(v′) = f(v + w) = f(v) + f(w) = f(v) + 0 = f(v) = u


Thus, v′ ∈ f^{-1}(u), and so v + W ⊆ f^{-1}(u).
Both inclusions imply f^{-1}(u) = v + W.


5.21. Suppose F: V → U and G: U → W are linear. Prove


(a) rank(G∘F) ≤ rank(G), (b) rank(G∘F) ≤ rank(F).


(a) Because F(V) ⊆ U, we also have G(F(V)) ⊆ G(U), and so dim[G(F(V))] ≤ dim[G(U)]. Then
rank(G∘F) = dim[(G∘F)(V)] = dim[G(F(V))] ≤ dim[G(U)] = rank(G).
(b) We have dim[G(F(V))] ≤ dim[F(V)]. Hence,
rank(G∘F) = dim[(G∘F)(V)] = dim[G(F(V))] ≤ dim[F(V)] = rank(F)


5.22. Prove Theorem 5.3: Let F: V → U be linear. Then,


(a) Im F is a subspace of U, (b) Ker F is a subspace of V.


(a) Because F(0) = 0, we have 0 ∈ Im F. Now suppose u, u′ ∈ Im F and a, b ∈ K. Because u and u′
belong to the image of F, there exist vectors v, v′ ∈ V such that F(v) = u and F(v′) = u′. Then


F(av + bv′) = aF(v) + bF(v′) = au + bu′ ∈ Im F
Thus, the image of F is a subspace of U.


(b) Because F(0) = 0, we have 0 ∈ Ker F. Now suppose v, w ∈ Ker F and a, b ∈ K. Because v and w
belong to the kernel of F, F(v) = 0 and F(w) = 0. Thus,


F(av + bw) = aF(v) + bF(w) = a0 + b0 = 0 + 0 = 0, and so av + bw ∈ Ker F
Thus, the kernel of F is a subspace of V.


5.23. Prove Theorem 5.6: Suppose V has finite dimension and F: V → U is linear. Then
dim V = dim(Ker F) + dim(Im F) = nullity(F) + rank(F)


Suppose dim(Ker F) = r and {w_1, ..., w_r} is a basis of Ker F, and suppose dim(Im F) = s and
{u_1, ..., u_s} is a basis of Im F. (By Proposition 5.4, Im F has finite dimension.) Because every
u_j ∈ Im F, there exist vectors v_1, ..., v_s in V such that F(v_1) = u_1, ..., F(v_s) = u_s. We claim that the set
B = {w_1, ..., w_r, v_1, ..., v_s}


is a basis of V; that is, (i) B spans V, and (ii) B is linearly independent. Once we prove (i) and (ii), then
dim V = r + s = dim(Ker F) + dim(Im F).


(i) B spans V. Let v ∈ V. Then F(v) ∈ Im F. Because the u_j span Im F, there exist scalars a_1, ..., a_s such
that F(v) = a_1 u_1 + ... + a_s u_s. Set v̂ = a_1 v_1 + ... + a_s v_s − v. Then


F(v̂) = F(a_1 v_1 + ... + a_s v_s − v) = a_1 F(v_1) + ... + a_s F(v_s) − F(v)
= a_1 u_1 + ... + a_s u_s − F(v) = 0
Thus, v̂ ∈ Ker F. Because the w_i span Ker F, there exist scalars b_1, ..., b_r such that
v̂ = b_1 w_1 + ... + b_r w_r = a_1 v_1 + ... + a_s v_s − v


Accordingly,


v = a_1 v_1 + ... + a_s v_s − b_1 w_1 − ... − b_r w_r


Thus, B spans V.
(ii) B is linearly independent. Suppose


x_1 w_1 + ... + x_r w_r + y_1 v_1 + ... + y_s v_s = 0     (1)
where x_i, y_j ∈ K. Then
0 = F(0) = F(x_1 w_1 + ... + x_r w_r + y_1 v_1 + ... + y_s v_s)
= x_1 F(w_1) + ... + x_r F(w_r) + y_1 F(v_1) + ... + y_s F(v_s)     (2)


But F(w_i) = 0, since w_i ∈ Ker F, and F(v_j) = u_j. Substituting into (2), we obtain
y_1 u_1 + ... + y_s u_s = 0. Since the u_j are linearly independent, each y_j = 0. Substitution into (1) gives
x_1 w_1 + ... + x_r w_r = 0. Since the w_i are linearly independent, each x_i = 0. Thus B is linearly
independent.


Singular and Nonsingular Linear Maps, Isomorphisms


5.24. Determine whether or not each of the following linear maps is nonsingular. If not, find a nonzero
vector v whose image is 0.


(a) F: R^2 → R^2 defined by F(x,y) = (x − y, x − 2y).
(b) G: R^2 → R^2 defined by G(x,y) = (2x − 4y, 3x − 6y).
(a) Find Ker F by setting F(v) = 0, where v = (x,y),


(x − y, x − 2y) = (0,0) or x − y = 0, x − 2y = 0 or x − y = 0, −y = 0


The only solution is x = 0, y = 0. Hence, F is nonsingular.
(b) Set G(x,y) = (0,0) to find Ker G:


(2x − 4y, 3x − 6y) = (0,0) or 2x − 4y = 0, 3x − 6y = 0 or x − 2y = 0


The system has nonzero solutions, because y is a free variable. Hence, G is singular. Let y = 1 to obtain
the solution v = (2,1), which is a nonzero vector, such that G(v) = 0.


5.25. The linear map F: R^2 → R^2 defined by F(x,y) = (x − y, x − 2y) is nonsingular by the previous
Problem 5.24. Find a formula for F^{-1}.


Set F(x,y) = (a,b), so that F^{-1}(a,b) = (x,y). We have


(x − y, x − 2y) = (a,b) or x − y = a, x − 2y = b or x − y = a, y = a − b


Solve for x and y in terms of a and b to get x = 2a − b, y = a − b. Thus,
F^{-1}(a,b) = (2a − b, a − b) or F^{-1}(x,y) = (2x − y, x − y)


(The second equation is obtained by replacing a and b by x and y, respectively.)


5.26. Let G: R^2 → R^3 be defined by G(x,y) = (x + y, x − 2y, 3x + y).
(a) Show that G is nonsingular. (b) Find a formula for G^{-1}.
(a) Set G(x,y) = (0,0,0) to find Ker G. We have
(x + y, x − 2y, 3x + y) = (0,0,0) or x + y = 0, x − 2y = 0, 3x + y = 0
The only solution is x = 0, y = 0; hence, G is nonsingular.


(b) Although G is nonsingular, it is not invertible, because R^2 and R^3 have different dimensions. (Thus,
Theorem 5.9 does not apply.) Accordingly, G^{-1} does not exist.
5.27. Suppose that F: V → U is linear and that V is of finite dimension. Show that V and the image of
F have the same dimension if and only if F is nonsingular. Determine all nonsingular linear
mappings T: R^4 → R^3.

By Theorem 5.6, dim V = dim(Im F) + dim(Ker F). Hence, V and Im F have the same dimension if
and only if dim(Ker F) = 0 or Ker F = {0} (i.e., if and only if F is nonsingular).

Because dim R^3 is less than dim R^4, we have that dim(Im T) is less than the dimension of the domain
R^4 of T. Accordingly, no linear mapping T: R^4 → R^3 can be nonsingular.


5.28. Prove Theorem 5.7: Let F: V → U be a nonsingular linear mapping. Then the image of any
linearly independent set is linearly independent.


Suppose v_1, v_2, ..., v_n are linearly independent vectors in V. We claim that F(v_1), F(v_2), ..., F(v_n) are
also linearly independent. Suppose a_1 F(v_1) + a_2 F(v_2) + ... + a_n F(v_n) = 0, where a_i ∈ K. Because F is
linear, F(a_1 v_1 + a_2 v_2 + ... + a_n v_n) = 0. Hence,


a_1 v_1 + a_2 v_2 + ... + a_n v_n ∈ Ker F


But F is nonsingular; that is, Ker F = {0}. Hence, a_1 v_1 + a_2 v_2 + ... + a_n v_n = 0. Because the v_i are
linearly independent, all the a_i are 0. Accordingly, the F(v_i) are linearly independent. Thus, the theorem is
proved.


5.29. Prove Theorem 5.9: Suppose V has finite dimension and dim V = dim U. Suppose F: V → U is
linear. Then F is an isomorphism if and only if F is nonsingular.


If F is an isomorphism, then only 0 maps to 0; hence, F is nonsingular. Conversely, suppose F is
nonsingular. Then dim(Ker F) = 0. By Theorem 5.6, dim V = dim(Ker F) + dim(Im F). Thus,


dim U = dim V = dim(Im F)


Because U has finite dimension, Im F = U. This means F maps V onto U. Thus, F is one-to-one and onto;
that is, F is an isomorphism.


Operations with Linear Maps


5.30. Define $F: \mathbf{R}^3 \to \mathbf{R}^2$ and $G: \mathbf{R}^3 \to \mathbf{R}^2$ by $F(x,y,z) = (2x,\ y + z)$ and $G(x,y,z) = (x - z,\ y)$. Find formulas defining the maps: (a) $F + G$, (b) $3F$, (c) $2F - 5G$.

(a) $(F + G)(x,y,z) = F(x,y,z) + G(x,y,z) = (2x,\ y + z) + (x - z,\ y) = (3x - z,\ 2y + z)$
(b) $(3F)(x,y,z) = 3F(x,y,z) = 3(2x,\ y + z) = (6x,\ 3y + 3z)$
(c) $(2F - 5G)(x,y,z) = 2F(x,y,z) - 5G(x,y,z) = 2(2x,\ y + z) - 5(x - z,\ y)$
$= (4x,\ 2y + 2z) + (-5x + 5z,\ -5y) = (-x + 5z,\ -3y + 2z)$


5.31. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ and $G: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x,y,z) = (2x,\ y + z)$ and $G(x,y) = (y, x)$. Derive formulas defining the mappings: (a) $G \circ F$, (b) $F \circ G$.

(a) $(G \circ F)(x,y,z) = G(F(x,y,z)) = G(2x,\ y + z) = (y + z,\ 2x)$

(b) The mapping $F \circ G$ is not defined, because the image of $G$ is not contained in the domain of $F$.


5.32. Prove: (a) The zero mapping $\mathbf{0}$, defined by $\mathbf{0}(v) = 0 \in U$ for every $v \in V$, is the zero element of Hom$(V, U)$. (b) The negative of $F \in$ Hom$(V, U)$ is the mapping $(-1)F$; that is, $-F = (-1)F$.

Let $F \in$ Hom$(V, U)$. Then, for every $v \in V$:

(a) $(F + \mathbf{0})(v) = F(v) + \mathbf{0}(v) = F(v) + 0 = F(v)$

Because $(F + \mathbf{0})(v) = F(v)$ for every $v \in V$, we have $F + \mathbf{0} = F$. Similarly, $\mathbf{0} + F = F$.

(b) $(F + (-1)F)(v) = F(v) + (-1)F(v) = F(v) - F(v) = 0 = \mathbf{0}(v)$

Thus, $F + (-1)F = \mathbf{0}$. Similarly, $(-1)F + F = \mathbf{0}$. Hence, $-F = (-1)F$.
5.33. Suppose $F_1, F_2, \ldots, F_n$ are linear maps from $V$ into $U$. Show that, for any scalars $a_1, a_2, \ldots, a_n$, and for any $v \in V$,

\[ (a_1F_1 + a_2F_2 + \cdots + a_nF_n)(v) = a_1F_1(v) + a_2F_2(v) + \cdots + a_nF_n(v) \]

The mapping $a_1F_1$ is defined by $(a_1F_1)(v) = a_1F_1(v)$. Hence, the theorem holds for $n = 1$. Accordingly, by induction,

\[ \begin{aligned} (a_1F_1 + a_2F_2 + \cdots + a_nF_n)(v) &= (a_1F_1)(v) + (a_2F_2 + \cdots + a_nF_n)(v) \\ &= a_1F_1(v) + a_2F_2(v) + \cdots + a_nF_n(v) \end{aligned} \]


5.34. Consider linear mappings $F: \mathbf{R}^3 \to \mathbf{R}^2$, $G: \mathbf{R}^3 \to \mathbf{R}^2$, $H: \mathbf{R}^3 \to \mathbf{R}^2$ defined by

\[ F(x,y,z) = (x + y + z,\ x + y), \qquad G(x,y,z) = (2x + z,\ x + y), \qquad H(x,y,z) = (2y,\ x) \]

Show that $F$, $G$, $H$ are linearly independent [as elements of Hom$(\mathbf{R}^3, \mathbf{R}^2)$].

Suppose, for scalars $a, b, c \in K$,

\[ aF + bG + cH = \mathbf{0} \qquad (1) \]

(Here $\mathbf{0}$ is the zero mapping.) For $e_1 = (1,0,0) \in \mathbf{R}^3$, we have $\mathbf{0}(e_1) = (0,0)$ and

\[ \begin{aligned} (aF + bG + cH)(e_1) &= aF(1,0,0) + bG(1,0,0) + cH(1,0,0) \\ &= a(1,1) + b(2,1) + c(0,1) = (a + 2b,\ a + b + c) \end{aligned} \]

Thus, by (1), $(a + 2b,\ a + b + c) = (0,0)$ and so

\[ a + 2b = 0 \quad\text{and}\quad a + b + c = 0 \qquad (2) \]

Similarly, for $e_2 = (0,1,0) \in \mathbf{R}^3$, we have $\mathbf{0}(e_2) = (0,0)$ and

\[ \begin{aligned} (aF + bG + cH)(e_2) &= aF(0,1,0) + bG(0,1,0) + cH(0,1,0) \\ &= a(1,1) + b(0,1) + c(2,0) = (a + 2c,\ a + b) \end{aligned} \]

Thus,

\[ a + 2c = 0 \quad\text{and}\quad a + b = 0 \qquad (3) \]

Using (2) and (3), we obtain

\[ a = 0, \quad b = 0, \quad c = 0 \qquad (4) \]

Because (1) implies (4), the mappings $F$, $G$, $H$ are linearly independent.


5.35. Let $k$ be a nonzero scalar. Show that a linear map $T$ is singular if and only if $kT$ is singular. Hence, $T$ is singular if and only if $-T$ is singular.

Suppose $T$ is singular. Then $T(v) = 0$ for some vector $v \neq 0$. Hence,

\[ (kT)(v) = kT(v) = k0 = 0 \]

and so $kT$ is singular.

Now suppose $kT$ is singular. Then $(kT)(w) = 0$ for some vector $w \neq 0$. Hence,

\[ T(kw) = kT(w) = (kT)(w) = 0 \]

But $k \neq 0$ and $w \neq 0$ implies $kw \neq 0$. Thus, $T$ is also singular.


5.36. Find the dimension $d$ of:
(a) Hom$(\mathbf{R}^3, \mathbf{R}^4)$, (b) Hom$(\mathbf{R}^5, \mathbf{R}^3)$, (c) Hom$(\mathbf{P}_3(t), \mathbf{R}^2)$, (d) Hom$(\mathbf{M}_{2,3}, \mathbf{R}^4)$.

Use $\dim[\text{Hom}(V, U)] = mn$, where $\dim V = m$ and $\dim U = n$.

(a) $d = 3(4) = 12$.
(b) $d = 5(3) = 15$.
(c) Because $\dim \mathbf{P}_3(t) = 4$, $d = 4(2) = 8$.
(d) Because $\dim \mathbf{M}_{2,3} = 6$, $d = 6(4) = 24$.
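
The independence argument of Problem 5.34 can also be checked mechanically. The sketch below is an illustration under stated assumptions, not the method of the text: each map is encoded by the six coefficients of its two output components (a choice made here for convenience), and the hypothetical helper `rank` performs exact Gaussian elimination. Three maps are linearly independent exactly when their coefficient vectors have rank 3.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of coefficient vectors via exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Coefficients of (x, y, z) in the first component, then in the second component.
F = [1, 1, 1, 1, 1, 0]   # F(x,y,z) = (x + y + z, x + y)
G = [2, 0, 1, 1, 1, 0]   # G(x,y,z) = (2x + z, x + y)
H = [0, 2, 0, 1, 0, 0]   # H(x,y,z) = (2y, x)

rank_FGH = rank([F, G, H])

# For contrast, F, G, and F + G are linearly dependent.
F_plus_G = [a + b for a, b in zip(F, G)]
rank_dependent = rank([F, G, F_plus_G])

print(rank_FGH, rank_dependent)
```

The encoding works because a linear map from $\mathbf{R}^3$ into $\mathbf{R}^2$ is determined by these six coefficients, which is consistent with $\dim[\text{Hom}(\mathbf{R}^3, \mathbf{R}^2)] = 6$.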
5.37. Prove Theorem 5.11: Suppose $\dim V = m$ and $\dim U = n$. Then $\dim[\text{Hom}(V, U)] = mn$.

Suppose $\{v_1, \ldots, v_m\}$ is a basis of $V$ and $\{u_1, \ldots, u_n\}$ is a basis of $U$. By Theorem 5.2, a linear mapping in Hom$(V, U)$ is uniquely determined by arbitrarily assigning elements of $U$ to the basis elements $v_i$ of $V$. We define

\[ F_{ij} \in \text{Hom}(V, U), \qquad i = 1, \ldots, m, \quad j = 1, \ldots, n \]

to be the linear mapping for which $F_{ij}(v_i) = u_j$, and $F_{ij}(v_k) = 0$ for $k \neq i$. That is, $F_{ij}$ maps $v_i$ into $u_j$ and the other $v$'s into 0. Observe that $\{F_{ij}\}$ contains exactly $mn$ elements; hence, the theorem is proved if we show that it is a basis of Hom$(V, U)$.

Proof that $\{F_{ij}\}$ generates Hom$(V, U)$. Consider an arbitrary function $F \in$ Hom$(V, U)$. Suppose $F(v_1) = w_1$, $F(v_2) = w_2$, $\ldots$, $F(v_m) = w_m$. Because $w_k \in U$, it is a linear combination of the $u$'s; say,

\[ w_k = a_{k1}u_1 + a_{k2}u_2 + \cdots + a_{kn}u_n, \qquad k = 1, \ldots, m, \quad a_{ij} \in K \qquad (1) \]

Consider the linear mapping $G = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}F_{ij}$. Because $G$ is a linear combination of the $F_{ij}$, the proof that $\{F_{ij}\}$ generates Hom$(V, U)$ is complete if we show that $F = G$.

We now compute $G(v_k)$, $k = 1, \ldots, m$. Because $F_{ij}(v_k) = 0$ for $k \neq i$ and $F_{kj}(v_k) = u_j$,

\[ \begin{aligned} G(v_k) &= \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}F_{ij}(v_k) = \sum_{j=1}^{n} a_{kj}F_{kj}(v_k) = \sum_{j=1}^{n} a_{kj}u_j \\ &= a_{k1}u_1 + a_{k2}u_2 + \cdots + a_{kn}u_n \end{aligned} \]

Thus, by (1), $G(v_k) = w_k$ for each $k$. But $F(v_k) = w_k$ for each $k$. Accordingly, by Theorem 5.2, $F = G$; hence, $\{F_{ij}\}$ generates Hom$(V, U)$.

Proof that $\{F_{ij}\}$ is linearly independent. Suppose, for scalars $c_{ij} \in K$,

\[ \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}F_{ij} = \mathbf{0} \]

For $v_k$, $k = 1, \ldots, m$,

\[ \begin{aligned} 0 = \mathbf{0}(v_k) &= \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}F_{ij}(v_k) = \sum_{j=1}^{n} c_{kj}F_{kj}(v_k) = \sum_{j=1}^{n} c_{kj}u_j \\ &= c_{k1}u_1 + c_{k2}u_2 + \cdots + c_{kn}u_n \end{aligned} \]

But the $u_j$ are linearly independent; hence, for $k = 1, \ldots, m$, we have $c_{k1} = 0$, $c_{k2} = 0$, $\ldots$, $c_{kn} = 0$. In other words, all the $c_{ij} = 0$, and so $\{F_{ij}\}$ is linearly independent.


5.38. Prove Theorem 5.12: (i) $G \circ (F + F') = G \circ F + G \circ F'$. (ii) $(G + G') \circ F = G \circ F + G' \circ F$. (iii) $k(G \circ F) = (kG) \circ F = G \circ (kF)$.

(i) For every $v \in V$,

\[ \begin{aligned} (G \circ (F + F'))(v) &= G((F + F')(v)) = G(F(v) + F'(v)) \\ &= G(F(v)) + G(F'(v)) = (G \circ F)(v) + (G \circ F')(v) = (G \circ F + G \circ F')(v) \end{aligned} \]

Thus, $G \circ (F + F') = G \circ F + G \circ F'$.

(ii) For every $v \in V$,

\[ \begin{aligned} ((G + G') \circ F)(v) &= (G + G')(F(v)) = G(F(v)) + G'(F(v)) \\ &= (G \circ F)(v) + (G' \circ F)(v) = (G \circ F + G' \circ F)(v) \end{aligned} \]

Thus, $(G + G') \circ F = G \circ F + G' \circ F$.

(iii) For every $v \in V$,

\[ (k(G \circ F))(v) = k(G \circ F)(v) = k(G(F(v))) = (kG)(F(v)) = ((kG) \circ F)(v) \]

and

\[ (k(G \circ F))(v) = k(G \circ F)(v) = k(G(F(v))) = G(kF(v)) = G((kF)(v)) = (G \circ (kF))(v) \]

Accordingly, $k(G \circ F) = (kG) \circ F = G \circ (kF)$. (We emphasize that two mappings are shown to be equal by showing that each of them assigns the same image to each point in the domain.)


Algebra of Linear Maps


5.39. Let $F$ and $G$ be the linear operators on $\mathbf{R}^2$ defined by $F(x,y) = (y,x)$ and $G(x,y) = (0,x)$. Find formulas defining the following operators:
(a) $F + G$, (b) $2F - 3G$, (c) $FG$, (d) $GF$, (e) $F^2$, (f) $G^2$.

(a) $(F + G)(x,y) = F(x,y) + G(x,y) = (y,x) + (0,x) = (y, 2x)$.

(b) $(2F - 3G)(x,y) = 2F(x,y) - 3G(x,y) = 2(y,x) - 3(0,x) = (2y, -x)$.

(c) $(FG)(x,y) = F(G(x,y)) = F(0,x) = (x,0)$.

(d) $(GF)(x,y) = G(F(x,y)) = G(y,x) = (0,y)$.

(e) $F^2(x,y) = F(F(x,y)) = F(y,x) = (x,y)$. (Note that $F^2 = I$, the identity mapping.)

(f) $G^2(x,y) = G(G(x,y)) = G(0,x) = (0,0)$. (Note that $G^2 = \mathbf{0}$, the zero mapping.)
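
The products in Problem 5.39 can be cross-checked by writing each operator as its matrix in the usual basis of $\mathbf{R}^2$ (columns are the images of $(1,0)$ and $(0,1)$) and multiplying. This is a minimal sketch; the helper `matmul` and the variable names are assumptions made for illustration.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

F = [[0, 1], [1, 0]]   # F(x, y) = (y, x)
G = [[0, 0], [1, 0]]   # G(x, y) = (0, x)
I = [[1, 0], [0, 1]]   # identity mapping
Z = [[0, 0], [0, 0]]   # zero mapping

FG = matmul(F, G)      # matrix of (x, y) -> (x, 0)
GF = matmul(G, F)      # matrix of (x, y) -> (0, y)
F2 = matmul(F, F)
G2 = matmul(G, G)

print(FG, GF, F2 == I, G2 == Z)
```

Note that `FG` and `GF` differ, which shows that composition of operators need not commute.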


5.40. Consider the linear operator $T$ on $\mathbf{R}^3$ defined by $T(x,y,z) = (2x,\ 4x - y,\ 2x + 3y - z)$.
(a) Show that $T$ is invertible. Find formulas for (b) $T^{-1}$, (c) $T^2$, (d) $T^{-2}$.

(a) Let $W$ = Ker $T$. We need only show that $T$ is nonsingular (i.e., that $W = \{0\}$). Set $T(x,y,z) = (0,0,0)$, which yields

\[ T(x,y,z) = (2x,\ 4x - y,\ 2x + 3y - z) = (0,0,0) \]

Thus, $W$ is the solution space of the homogeneous system

\[ 2x = 0, \qquad 4x - y = 0, \qquad 2x + 3y - z = 0 \]

which has only the trivial solution $(0,0,0)$. Thus, $W = \{0\}$. Hence, $T$ is nonsingular, and so $T$ is invertible.

(b) Set $T(x,y,z) = (r,s,t)$ [and so $T^{-1}(r,s,t) = (x,y,z)$]. We have

\[ (2x,\ 4x - y,\ 2x + 3y - z) = (r,s,t) \quad\text{or}\quad 2x = r, \quad 4x - y = s, \quad 2x + 3y - z = t \]

Solve for $x$, $y$, $z$ in terms of $r$, $s$, $t$ to get $x = \tfrac{1}{2}r$, $y = 2r - s$, $z = 7r - 3s - t$. Thus,

\[ T^{-1}(r,s,t) = (\tfrac{1}{2}r,\ 2r - s,\ 7r - 3s - t) \quad\text{or}\quad T^{-1}(x,y,z) = (\tfrac{1}{2}x,\ 2x - y,\ 7x - 3y - z) \]

(c) Apply $T$ twice to get

\[ \begin{aligned} T^2(x,y,z) &= T(2x,\ 4x - y,\ 2x + 3y - z) \\ &= [4x,\ 4(2x) - (4x - y),\ 2(2x) + 3(4x - y) - (2x + 3y - z)] \\ &= (4x,\ 4x + y,\ 14x - 6y + z) \end{aligned} \]

(d) Apply $T^{-1}$ twice to get

\[ \begin{aligned} T^{-2}(x,y,z) &= T^{-1}(\tfrac{1}{2}x,\ 2x - y,\ 7x - 3y - z) \\ &= [\tfrac{1}{4}x,\ 2(\tfrac{1}{2}x) - (2x - y),\ 7(\tfrac{1}{2}x) - 3(2x - y) - (7x - 3y - z)] \\ &= (\tfrac{1}{4}x,\ -x + y,\ -\tfrac{19}{2}x + 6y + z) \end{aligned} \]
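
The four formulas of Problem 5.40 can be verified with exact rational arithmetic. In this sketch each operator is written as its coefficient matrix (row $i$ holds the coefficients of $x, y, z$ in the $i$th component); the helper `matmul` is an assumption introduced for the check, not notation from the text.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

half = Fraction(1, 2)
T     = [[2, 0, 0], [4, -1, 0], [2, 3, -1]]        # T(x,y,z)
T_inv = [[half, 0, 0], [2, -1, 0], [7, -3, -1]]    # formula from part (b)
I3    = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

T_sq     = matmul(T, T)          # compare with part (c)
T_inv_sq = matmul(T_inv, T_inv)  # compare with part (d)

print(matmul(T, T_inv) == I3, matmul(T_inv, T) == I3)
print(T_sq)
print(T_inv_sq)
```

The rows of `T_sq` and `T_inv_sq` reproduce the coefficients found in parts (c) and (d).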
5.41. Let $V$ be of finite dimension and let $T$ be a linear operator on $V$ for which $TR = I$, for some operator $R$ on $V$. (We call $R$ a right inverse of $T$.)

(a) Show that $T$ is invertible. (b) Show that $R = T^{-1}$.
(c) Give an example showing that the above need not hold if $V$ is of infinite dimension.

(a) Let $\dim V = n$. By Theorem 5.14, $T$ is invertible if and only if $T$ is onto; hence, $T$ is invertible if and only if rank$(T) = n$. We have $n = \text{rank}(I) = \text{rank}(TR) \leq \text{rank}(T) \leq n$. Hence, rank$(T) = n$ and $T$ is invertible.

(b) $TT^{-1} = T^{-1}T = I$. Then $R = IR = (T^{-1}T)R = T^{-1}(TR) = T^{-1}I = T^{-1}$.

(c) Let $V$ be the space of polynomials in $t$ over $K$; say, $p(t) = a_0 + a_1t + a_2t^2 + \cdots + a_st^s$. Let $T$ and $R$ be the operators on $V$ defined by

\[ T(p(t)) = 0 + a_1 + a_2t + \cdots + a_st^{s-1} \quad\text{and}\quad R(p(t)) = a_0t + a_1t^2 + \cdots + a_st^{s+1} \]

We have

\[ (TR)(p(t)) = T(R(p(t))) = T(a_0t + a_1t^2 + \cdots + a_st^{s+1}) = a_0 + a_1t + \cdots + a_st^s = p(t) \]

and so $TR = I$, the identity mapping. On the other hand, if $k \in K$ and $k \neq 0$, then

\[ (RT)(k) = R(T(k)) = R(0) = 0 \neq k \]

Accordingly, $RT \neq I$.


5.42. Let $F$ and $G$ be linear operators on $\mathbf{R}^2$ defined by $F(x,y) = (0,x)$ and $G(x,y) = (x,0)$. Show that (a) $GF = \mathbf{0}$, the zero mapping, but $FG \neq \mathbf{0}$. (b) $G^2 = G$.

(a) $(GF)(x,y) = G(F(x,y)) = G(0,x) = (0,0)$. Because $GF$ assigns $0 = (0,0)$ to every vector $(x,y)$ in $\mathbf{R}^2$, it is the zero mapping; that is, $GF = \mathbf{0}$.
On the other hand, $(FG)(x,y) = F(G(x,y)) = F(x,0) = (0,x)$. For example, $(FG)(2,3) = (0,2)$. Thus, $FG \neq \mathbf{0}$, as it does not assign $0 = (0,0)$ to every vector in $\mathbf{R}^2$.

(b) For any vector $(x,y)$ in $\mathbf{R}^2$, we have $G^2(x,y) = G(G(x,y)) = G(x,0) = (x,0) = G(x,y)$. Hence, $G^2 = G$.


5.43. Find the dimension of (a) $A(\mathbf{R}^4)$, (b) $A(\mathbf{P}_2(t))$, (c) $A(\mathbf{M}_{2,3})$.

Use $\dim[A(V)] = n^2$, where $\dim V = n$. Hence, (a) $\dim[A(\mathbf{R}^4)] = 4^2 = 16$, (b) $\dim[A(\mathbf{P}_2(t))] = 3^2 = 9$, (c) $\dim[A(\mathbf{M}_{2,3})] = 6^2 = 36$.


5.44. Let $E$ be a linear operator on $V$ for which $E^2 = E$. (Such an operator is called a projection.) Let $U$ be the image of $E$, and let $W$ be the kernel. Prove

(a) If $u \in U$, then $E(u) = u$ (i.e., $E$ is the identity mapping on $U$).
(b) If $E \neq I$, then $E$ is singular; that is, $E(v) = 0$ for some $v \neq 0$.
(c) $V = U \oplus W$.

(a) If $u \in U$, the image of $E$, then $E(v) = u$ for some $v \in V$. Hence, using $E^2 = E$, we have

\[ u = E(v) = E^2(v) = E(E(v)) = E(u) \]

(b) If $E \neq I$, then for some $v \in V$, $E(v) = u$, where $v \neq u$. By (a), $E(u) = u$. Thus,

\[ E(v - u) = E(v) - E(u) = u - u = 0, \quad\text{where}\quad v - u \neq 0 \]

(c) We first show that $V = U + W$. Let $v \in V$. Set $u = E(v)$ and $w = v - E(v)$. Then

\[ v = E(v) + v - E(v) = u + w \]

By definition, $u = E(v) \in U$, the image of $E$. We now show that $w \in W$, the kernel of $E$:

\[ E(w) = E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0 \]

and thus $w \in W$. Hence, $V = U + W$.

We next show that $U \cap W = \{0\}$. Let $v \in U \cap W$. Because $v \in U$, $E(v) = v$ by part (a). Because $v \in W$, $E(v) = 0$. Thus, $v = E(v) = 0$ and so $U \cap W = \{0\}$.

The above two properties imply that $V = U \oplus W$.
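
The counterexample in Problem 5.41(c) is easy to experiment with. The sketch below represents a polynomial $a_0 + a_1t + \cdots + a_st^s$ by its coefficient list `[a0, a1, ..., as]`, with the zero polynomial as the empty list; this encoding and the helper `normalize` are assumptions made here for illustration.

```python
def normalize(p):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def T(p):
    """Discard the constant term a_0 and lower every remaining power by one."""
    return normalize(p[1:])

def R(p):
    """Multiply by t: raise every power by one."""
    return normalize([0] + list(p))

p = [3, 0, -2, 5]          # 3 - 2t^2 + 5t^3
print(T(R(p)) == p)        # TR acts as the identity on p
print(R(T([7])))           # RT sends the nonzero constant 7 to the zero polynomial
print(R(T(p)))             # RT loses the constant term of p
```

Here $T$ has a right inverse but no left inverse, which cannot happen for an operator on a finite-dimensional space by parts (a) and (b).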
SUPPLEMENTARY PROBLEMS


Mappings


5.45. Determine the number of different mappings from (a) $\{1,2\}$ into $\{1,2,3\}$, (b) $\{1,2,\ldots,r\}$ into $\{1,2,\ldots,s\}$.


5.46. Let $f: \mathbf{R} \to \mathbf{R}$ and $g: \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = x^2 + 3x + 1$ and $g(x) = 2x - 3$. Find formulas defining the composition mappings: (a) $f \circ g$; (b) $g \circ f$; (c) $g \circ g$; (d) $f \circ f$.


5.47. For each mapping $f: \mathbf{R} \to \mathbf{R}$, find a formula for its inverse: (a) $f(x) = 3x - 7$, (b) $f(x) = x^3 + 2$.


5.48. For any mapping $f: A \to B$, show that $1_B \circ f = f = f \circ 1_A$.


Linear Mappings


5.49. Show that the following mappings are linear:
(a) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x,y,z) = (x + 2y - 3z,\ 4x - 5y + 6z)$.
(b) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x,y) = (ax + by,\ cx + dy)$, where $a$, $b$, $c$, $d$ belong to $\mathbf{R}$.


5.50. Show that the following mappings are not linear:
(a) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x,y) = (x^2, y^2)$.
(b) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x,y,z) = (x + 1,\ y + z)$.
(c) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x,y) = (xy, y)$.
(d) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x,y,z) = (|x|,\ y + z)$.


5.51. Find $F(a,b)$, where the linear map $F: \mathbf{R}^2 \to \mathbf{R}^2$ is defined by $F(1,2) = (3,-1)$ and $F(0,1) = (2,1)$.


5.52. Find a $2 \times 2$ matrix $A$ that maps
(a) $(1,3)^T$ and $(1,4)^T$ into $(-2,5)^T$ and $(3,-1)^T$, respectively.
(b) $(2,-4)^T$ and $(-1,2)^T$ into $(1,1)^T$ and $(1,3)^T$, respectively.


5.53. Find a $2 \times 2$ singular matrix $B$ that maps $(1,1)^T$ into $(1,3)^T$.


5.54. Let $V$ be the vector space of real $n$-square matrices, and let $M$ be a fixed nonzero matrix in $V$. Show that the first two of the following mappings $T: V \to V$ are linear, but the third is not:
(a) $T(A) = MA$, (b) $T(A) = AM + MA$, (c) $T(A) = M + A$.


5.55. Give an example of a nonlinear map $F: \mathbf{R}^2 \to \mathbf{R}^2$ such that $F^{-1}(0) = \{0\}$ but $F$ is not one-to-one.


5.56. Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x,y) = (3x + 5y,\ 2x + 3y)$, and let $S$ be the unit circle in $\mathbf{R}^2$. ($S$ consists of all points satisfying $x^2 + y^2 = 1$.) Find (a) the image $F(S)$, (b) the preimage $F^{-1}(S)$.


5.57. Consider the linear map $G: \mathbf{R}^3 \to \mathbf{R}^3$ defined by $G(x,y,z) = (x + y + z,\ y - 2z,\ y - 3z)$ and the unit sphere $S_2$ in $\mathbf{R}^3$, which consists of the points satisfying $x^2 + y^2 + z^2 = 1$. Find (a) $G(S_2)$, (b) $G^{-1}(S_2)$.


5.58. Let $H$ be the plane $x + 2y - 3z = 4$ in $\mathbf{R}^3$ and let $G$ be the linear map in Problem 5.57. Find (a) $G(H)$, (b) $G^{-1}(H)$.


5.59. Let $W$ be a subspace of $V$. The inclusion map, denoted by $i: W \to V$, is defined by $i(w) = w$ for every $w \in W$. Show that the inclusion map is linear.


5.60. Suppose $F: V \to U$ is linear. Show that $F(-v) = -F(v)$.


Kernel and Image of Linear Mappings


5.61. For each linear map $F$, find a basis and the dimension of the kernel and the image of $F$:
(a) $F: \mathbf{R}^3 \to \mathbf{R}^3$ defined by $F(x,y,z) = (x + 2y - 3z,\ 2x + 5y - 4z,\ x + 4y + z)$,
(b) $F: \mathbf{R}^4 \to \mathbf{R}^3$ defined by $F(x,y,z,t) = (x + 2y + 3z + 2t,\ 2x + 4y + 7z + 5t,\ x + 2y + 6z + 5t)$.


5.62. For each linear map $G$, find a basis and the dimension of the kernel and the image of $G$:
(a) $G: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $G(x,y,z) = (x + y + z,\ 2x + 2y + 2z)$,
(b) $G: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $G(x,y,z) = (x + y,\ y + z)$,
(c) $G: \mathbf{R}^5 \to \mathbf{R}^3$ defined by
$G(x,y,z,s,t) = (x + 2y + 2z + s + t,\ x + 2y + 3z + 2s - t,\ 3x + 6y + 8z + 5s - t)$.


5.63. Each of the following matrices determines a linear map from $\mathbf{R}^4$ into $\mathbf{R}^3$:

\[ \text{(a)}\ A = \begin{bmatrix} 1 & 2 & 0 & 1 \\ 2 & -1 & 2 & -1 \\ 1 & -3 & 2 & -2 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 1 & 0 & 2 & -1 \\ 2 & 3 & -1 & 1 \\ -2 & 0 & -5 & 3 \end{bmatrix} \]

Find a basis as well as the dimension of the kernel and the image of each linear map.


5.64. Find a linear mapping $F: \mathbf{R}^3 \to \mathbf{R}^3$ whose image is spanned by $(1,2,3)$ and $(4,5,6)$.


5.65. Find a linear mapping $G: \mathbf{R}^4 \to \mathbf{R}^3$ whose kernel is spanned by $(1,2,3,4)$ and $(0,1,1,1)$.


5.66. Let $V = \mathbf{P}_{10}(t)$, the vector space of polynomials of degree $\leq 10$. Consider the linear map $\mathbf{D}^4: V \to V$, where $\mathbf{D}^4$ denotes the fourth derivative $d^4(f)/dt^4$. Find a basis and the dimension of (a) the image of $\mathbf{D}^4$; (b) the kernel of $\mathbf{D}^4$.


5.67. Suppose $F: V \to U$ is linear. Show that (a) the image of any subspace of $V$ is a subspace of $U$; (b) the preimage of any subspace of $U$ is a subspace of $V$.


5.68. Show that if $F: V \to U$ is onto, then $\dim U \leq \dim V$. Determine all linear maps $F: \mathbf{R}^3 \to \mathbf{R}^4$ that are onto.


5.69. Consider the zero mapping $\mathbf{0}: V \to U$ defined by $\mathbf{0}(v) = 0$ for every $v \in V$. Find the kernel and the image of $\mathbf{0}$.

Operations with Linear Mappings


5.70. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ and $G: \mathbf{R}^3 \to \mathbf{R}^2$ be defined by $F(x,y,z) = (y,\ x + z)$ and $G(x,y,z) = (2z,\ x - y)$. Find formulas defining the mappings $F + G$ and $3F - 2G$.


5.71. Let $H: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $H(x,y) = (y, 2x)$. Using the maps $F$ and $G$ in Problem 5.70, find formulas defining the mappings: (a) $H \circ F$ and $H \circ G$, (b) $F \circ H$ and $G \circ H$, (c) $H \circ (F + G)$ and $H \circ F + H \circ G$.


5.72. Show that the following mappings $F$, $G$, $H$ are linearly independent:
(a) $F, G, H \in$ Hom$(\mathbf{R}^2, \mathbf{R}^2)$ defined by $F(x,y) = (x, 2y)$, $G(x,y) = (y,\ x + y)$, $H(x,y) = (0, x)$,
(b) $F, G, H \in$ Hom$(\mathbf{R}^3, \mathbf{R})$ defined by $F(x,y,z) = x + y + z$, $G(x,y,z) = y + z$, $H(x,y,z) = x - z$.


5.73. For $F, G \in$ Hom$(V, U)$, show that rank$(F + G) \leq$ rank$(F)$ + rank$(G)$. (Here $V$ has finite dimension.)


5.74. Let $F: V \to U$ and $G: U \to V$ be linear. Show that if $F$ and $G$ are nonsingular, then $G \circ F$ is nonsingular. Give an example where $G \circ F$ is nonsingular but $G$ is not. [Hint: Let $\dim V < \dim U$.]


5.75. Find the dimension $d$ of (a) Hom$(\mathbf{R}^2, \mathbf{R}^8)$, (b) Hom$(\mathbf{P}_4(t), \mathbf{R}^3)$, (c) Hom$(\mathbf{M}_{2,4}, \mathbf{P}_2(t))$.


5.76. Determine whether or not each of the following linear maps is nonsingular. If not, find a nonzero vector $v$ whose image is 0; otherwise find a formula for the inverse map:
(a) $F: \mathbf{R}^3 \to \mathbf{R}^3$ defined by $F(x,y,z) = (x + y + z,\ 2x + 3y + 5z,\ x + 3y + 7z)$,
(b) $G: \mathbf{R}^3 \to \mathbf{P}_2(t)$ defined by $G(x,y,z) = (x + y)t^2 + (x + 2y + 2z)t + y + z$,
(c) $H: \mathbf{R}^2 \to \mathbf{P}_2(t)$ defined by $H(x,y) = (x + 2y)t^2 + (x - y)t$.


5.77. When can $\dim[\text{Hom}(V, U)] = \dim V$?
Algebra of Linear Operators


5.78. Let $F$ and $G$ be the linear operators on $\mathbf{R}^2$ defined by $F(x,y) = (x + y,\ 0)$ and $G(x,y) = (-y, x)$. Find formulas defining the linear operators: (a) $F + G$, (b) $5F - 3G$, (c) $FG$, (d) $GF$, (e) $F^2$, (f) $G^2$.


5.79. Show that each linear operator $T$ on $\mathbf{R}^2$ is nonsingular and find a formula for $T^{-1}$, where
(a) $T(x,y) = (x + 2y,\ 2x + 3y)$, (b) $T(x,y) = (2x - 3y,\ 3x - 4y)$.


5.80. Show that each of the following linear operators $T$ on $\mathbf{R}^3$ is nonsingular and find a formula for $T^{-1}$, where
(a) $T(x,y,z) = (x - 3y - 2z,\ y - 4z,\ z)$; (b) $T(x,y,z) = (x + z,\ x - y,\ y)$.


5.81. Find the dimension of $A(V)$, where (a) $V = \mathbf{R}^7$, (b) $V = \mathbf{P}_5(t)$, (c) $V = \mathbf{M}_{3,4}$.


5.82. Which of the following integers can be the dimension of an algebra $A(V)$ of linear maps:
5, 9, 12, 25, 28, 36, 45, 64, 88, 100?


5.83. Let $T$ be the linear operator on $\mathbf{R}^2$ defined by $T(x,y) = (x + 2y,\ 3x + 4y)$. Find a formula for $f(T)$, where
(a) $f(t) = t^2 + 2t - 3$, (b) $f(t) = t^2 - 5t - 2$.


Miscellaneous Problems


5.84. Suppose $F: V \to U$ is linear and $k$ is a nonzero scalar. Prove that the maps $F$ and $kF$ have the same kernel and the same image.


5.85. Suppose $F$ and $G$ are linear operators on $V$ and that $F$ is nonsingular. Assume that $V$ has finite dimension. Show that rank$(FG)$ = rank$(GF)$ = rank$(G)$.


5.86. Suppose $V$ has finite dimension. Suppose $T$ is a linear operator on $V$ such that rank$(T^2)$ = rank$(T)$. Show that Ker $T \cap$ Im $T = \{0\}$.


5.87. Suppose $V = U \oplus W$. Let $E_1$ and $E_2$ be the linear operators on $V$ defined by $E_1(v) = u$, $E_2(v) = w$, where $v = u + w$, $u \in U$, $w \in W$. Show that (a) $E_1^2 = E_1$ and $E_2^2 = E_2$ (i.e., that $E_1$ and $E_2$ are projections); (b) $E_1 + E_2 = I$, the identity mapping; (c) $E_1E_2 = \mathbf{0}$ and $E_2E_1 = \mathbf{0}$.


5.88. Let $E_1$ and $E_2$ be linear operators on $V$ satisfying parts (a), (b), (c) of Problem 5.87. Prove

\[ V = \operatorname{Im} E_1 \oplus \operatorname{Im} E_2 \]


5.89. Let $v$ and $w$ be elements of a real vector space $V$. The line segment $L$ from $v$ to $v + w$ is defined to be the set of vectors $v + tw$ for $0 \leq t \leq 1$. (See Fig. 5-6.)

(a) Show that the line segment $L$ between vectors $v$ and $u$ consists of the points:
(i) $(1 - t)v + tu$ for $0 \leq t \leq 1$, (ii) $t_1v + t_2u$ for $t_1 + t_2 = 1$, $t_1 \geq 0$, $t_2 \geq 0$.
(b) Let $F: V \to U$ be linear. Show that the image $F(L)$ of a line segment $L$ in $V$ is a line segment in $U$.


Figure 5-6


5.90. Let $F: V \to U$ be linear and let $W$ be a subspace of $V$. The restriction of $F$ to $W$ is the map $F|W: W \to U$ defined by $F|W(v) = F(v)$ for every $v$ in $W$. Prove the following:
(a) $F|W$ is linear; (b) Ker$(F|W)$ = (Ker $F$) $\cap\ W$; (c) Im$(F|W) = F(W)$.


5.91. A subset $X$ of a vector space $V$ is said to be convex if the line segment $L$ between any two points (vectors) $P, Q \in X$ is contained in $X$. (a) Show that the intersection of convex sets is convex; (b) suppose $F: V \to U$ is linear and $X$ is convex. Show that $F(X)$ is convex.


ANSWERS TO SUPPLEMENTARY PROBLEMS


5.45. (a) $3^2 = 9$, (b) $s^r$

5.46. (a) $(f \circ g)(x) = 4x^2 - 6x + 1$, (b) $(g \circ f)(x) = 2x^2 + 6x - 1$, (c) $(g \circ g)(x) = 4x - 9$, (d) $(f \circ f)(x) = x^4 + 6x^3 + 14x^2 + 15x + 5$

5.47. (a) $f^{-1}(x) = \tfrac{1}{3}(x + 7)$, (b) $f^{-1}(x) = \sqrt[3]{x - 2}$

5.49. $F(x,y,z) = A(x,y,z)^T$, where (a) $A = \begin{bmatrix} 1 & 2 & -3 \\ 4 & -5 & 6 \end{bmatrix}$; (b) $F(x,y) = A(x,y)^T$, where $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$

5.50. (a) $u = (2,2)$, $k = 3$; then $F(ku) = (36,36)$ but $kF(u) = (12,12)$; (b) $F(0) \neq 0$;
(c) $u = (1,2)$, $v = (3,4)$; then $F(u + v) = (24,6)$ but $F(u) + F(v) = (14,6)$;
(d) $u = (1,2,3)$, $k = -2$; then $F(ku) = (2,-10)$ but $kF(u) = (-2,-10)$.

5.51. $F(a,b) = (-a + 2b,\ -3a + b)$

5.52. (a) $A = \begin{bmatrix} -17 & 5 \\ 23 & -6 \end{bmatrix}$; (b) None. $(2,-4)$ and $(-1,2)$ are linearly dependent but not $(1,1)$ and $(1,3)$.

5.53. $B = \begin{bmatrix} 1 & 0 \\ 3 & 0 \end{bmatrix}$ [Hint: Send $(0,1)^T$ into $(0,0)^T$.]

5.55. $F(x,y) = (x^2, y^2)$

5.56. (a) $13x^2 - 42xy + 34y^2 = 1$, (b) $13x^2 + 42xy + 34y^2 = 1$

5.57. (a) $x^2 - 8xy + 26y^2 + 6xz - 38yz + 14z^2 = 1$, (b) $x^2 + 2xy + 3y^2 + 2xz - 8yz + 14z^2 = 1$

5.58. (a) $x - y + 2z = 4$, (b) $x + 6z = 4$

5.61. (a) $\dim(\operatorname{Ker} F) = 1$, $\{(7,-2,1)\}$; $\dim(\operatorname{Im} F) = 2$, $\{(1,2,1), (0,1,2)\}$;
(b) $\dim(\operatorname{Ker} F) = 2$, $\{(-2,1,0,0), (1,0,-1,1)\}$; $\dim(\operatorname{Im} F) = 2$, $\{(1,2,1), (0,1,3)\}$

5.62. (a) $\dim(\operatorname{Ker} G) = 2$, $\{(1,0,-1), (1,-1,0)\}$; $\dim(\operatorname{Im} G) = 1$, $\{(1,2)\}$;
(b) $\dim(\operatorname{Ker} G) = 1$, $\{(1,-1,1)\}$; Im $G = \mathbf{R}^2$, $\{(1,0), (0,1)\}$;
(c) $\dim(\operatorname{Ker} G) = 3$, $\{(-2,1,0,0,0), (1,0,-1,1,0), (-5,0,2,0,1)\}$; $\dim(\operatorname{Im} G) = 2$, $\{(1,1,3), (0,1,2)\}$

5.63. (a) $\dim(\operatorname{Ker} A) = 2$, $\{(4,-2,-5,0), (1,-3,0,5)\}$; $\dim(\operatorname{Im} A) = 2$, $\{(1,2,1), (0,1,1)\}$;
(b) $\dim(\operatorname{Ker} B) = 1$, $\{(-1, \tfrac{2}{3}, 1, 1)\}$; Im $B = \mathbf{R}^3$

5.64. $F(x,y,z) = (x + 4y,\ 2x + 5y,\ 3x + 6y)$

5.65. $G(x,y,z,t) = (x + y - z,\ 2x + y - t,\ 0)$

5.66. (a) $\{1, t, t^2, \ldots, t^6\}$, (b) $\{1, t, t^2, t^3\}$

5.68. None, because $\dim \mathbf{R}^4 > \dim \mathbf{R}^3$.

5.69. Ker $\mathbf{0} = V$, Im $\mathbf{0} = \{0\}$

5.70. $(F + G)(x,y,z) = (y + 2z,\ 2x - y + z)$, $(3F - 2G)(x,y,z) = (3y - 4z,\ x + 2y + 3z)$

5.71. (a) $(H \circ F)(x,y,z) = (x + z,\ 2y)$, $(H \circ G)(x,y,z) = (x - y,\ 4z)$; (b) not defined;
(c) $(H \circ (F + G))(x,y,z) = (H \circ F + H \circ G)(x,y,z) = (2x - y + z,\ 2y + 4z)$

5.74. $F(x,y) = (x,y,y)$, $G(x,y,z) = (x,y)$

5.75. (a) 16, (b) 15, (c) 24

5.76. (a) $v = (2,-3,1)$; (b) $G^{-1}(at^2 + bt + c) = (b - 2c,\ a - b + 2c,\ -a + b - c)$;
(c) $H$ is nonsingular, but not invertible, because $\dim \mathbf{P}_2(t) > \dim \mathbf{R}^2$.

5.77. $\dim U = 1$; that is, $U = K$.

5.78. (a) $(F + G)(x,y) = (x,x)$; (b) $(5F - 3G)(x,y) = (5x + 8y,\ -3x)$; (c) $(FG)(x,y) = (x - y,\ 0)$;
(d) $(GF)(x,y) = (0,\ x + y)$; (e) $F^2(x,y) = (x + y,\ 0)$ (note that $F^2 = F$); (f) $G^2(x,y) = (-x, -y)$.
[Note that $G^2 + I = \mathbf{0}$; hence, $G$ is a zero of $f(t) = t^2 + 1$.]

5.79. (a) $T^{-1}(x,y) = (-3x + 2y,\ 2x - y)$, (b) $T^{-1}(x,y) = (-4x + 3y,\ -3x + 2y)$

5.80. (a) $T^{-1}(x,y,z) = (x + 3y + 14z,\ y + 4z,\ z)$, (b) $T^{-1}(x,y,z) = (y + z,\ z,\ x - y - z)$

5.81. (a) 49, (b) 36, (c) 144

5.82. Squares: 9, 25, 36, 64, 100

5.83. (a) $f(T)(x,y) = (6x + 14y,\ 21x + 27y)$; (b) $f(T)(x,y) = (0,0)$; that is, $f(T) = \mathbf{0}$


CHAPTER 6


Linear Mappings and Matrices


6.1 Introduction


Consider a basis $S = \{u_1, u_2, \ldots, u_n\}$ of a vector space $V$ over a field $K$. For any vector $v \in V$, suppose

\[ v = a_1u_1 + a_2u_2 + \cdots + a_nu_n \]

Then the coordinate vector of $v$ relative to the basis $S$, which we assume to be a column vector (unless otherwise stated or implied), is denoted and defined by

\[ [v]_S = [a_1, a_2, \ldots, a_n]^T \]

Recall (Section 4.11) that the mapping $v \mapsto [v]_S$, determined by the basis $S$, is an isomorphism between $V$ and $K^n$.

This chapter shows that there is also an isomorphism, determined by the basis $S$, between the algebra $A(V)$ of linear operators on $V$ and the algebra $\mathbf{M}$ of $n$-square matrices over $K$. Thus, every linear mapping $F: V \to V$ will correspond to an $n$-square matrix $[F]_S$ determined by the basis $S$. We will also show how our matrix representation changes when we choose another basis.


6.2 Matrix Representation of a Linear Operator


Let $T$ be a linear operator (transformation) from a vector space $V$ into itself, and suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of $V$. Now $T(u_1), T(u_2), \ldots, T(u_n)$ are vectors in $V$, and so each is a linear combination of the vectors in the basis $S$; say,

\[ \begin{aligned} T(u_1) &= a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\ T(u_2) &= a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\ &\ \ \vdots \\ T(u_n) &= a_{n1}u_1 + a_{n2}u_2 + \cdots + a_{nn}u_n \end{aligned} \]

The following definition applies.


DEFINITION: The transpose of the above matrix of coefficients, denoted by $m_S(T)$ or $[T]_S$, is called the matrix representation of $T$ relative to the basis $S$, or simply the matrix of $T$ in the basis $S$. (The subscript $S$ may be omitted if the basis $S$ is understood.)

Using the coordinate (column) vector notation, the matrix representation of $T$ may be written in the form

\[ m_S(T) = [T]_S = \big[\, [T(u_1)]_S,\ [T(u_2)]_S,\ \ldots,\ [T(u_n)]_S \,\big] \]

That is, the columns of $m_S(T)$ are the coordinate vectors of $T(u_1), T(u_2), \ldots, T(u_n)$, respectively.
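EXAMPLE 6.1 Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be the linear operator defined by $F(x,y) = (2x + 3y,\ 4x - 5y)$.

(a) Find the matrix representation of $F$ relative to the basis $S = \{u_1, u_2\} = \{(1,2), (2,5)\}$.

(1) First find $F(u_1)$, and then write it as a linear combination of the basis vectors $u_1$ and $u_2$. (For notational convenience, we use column vectors.) We have

\[ F(u_1) = F\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 8 \\ -6 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix} \quad\text{and}\quad \begin{aligned} x + 2y &= 8 \\ 2x + 5y &= -6 \end{aligned} \]

Solve the system to obtain $x = 52$, $y = -22$. Hence, $F(u_1) = 52u_1 - 22u_2$.

(2) Next find $F(u_2)$, and then write it as a linear combination of $u_1$ and $u_2$:

\[ F(u_2) = F\begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} 19 \\ -17 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix} \quad\text{and}\quad \begin{aligned} x + 2y &= 19 \\ 2x + 5y &= -17 \end{aligned} \]

Solve the system to get $x = 129$, $y = -55$. Thus, $F(u_2) = 129u_1 - 55u_2$.

Now write the coordinates of $F(u_1)$ and $F(u_2)$ as columns to obtain the matrix

\[ [F]_S = \begin{bmatrix} 52 & 129 \\ -22 & -55 \end{bmatrix} \]

(b) Find the matrix representation of $F$ relative to the (usual) basis $E = \{e_1, e_2\} = \{(1,0), (0,1)\}$.

Find $F(e_1)$ and write it as a linear combination of the usual basis vectors $e_1$ and $e_2$, and then find $F(e_2)$ and write it as a linear combination of $e_1$ and $e_2$. We have

\[ \begin{aligned} F(e_1) &= F(1,0) = (2,4) = 2e_1 + 4e_2 \\ F(e_2) &= F(0,1) = (3,-5) = 3e_1 - 5e_2 \end{aligned} \quad\text{and so}\quad [F]_E = \begin{bmatrix} 2 & 3 \\ 4 & -5 \end{bmatrix} \]

Note that the coordinates of $F(e_1)$ and $F(e_2)$ form the columns, not the rows, of $[F]_E$. Also, note that the arithmetic is much simpler using the usual basis of $\mathbf{R}^2$.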
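
Both parts of Example 6.1 can be reproduced in a few lines. The sketch below is an illustration, not the text's procedure: it solves each $2 \times 2$ system with Cramer's rule rather than by elimination, and the helper names `coords` and `matrix_rep` are assumptions introduced here.

```python
from fractions import Fraction

def F(v):
    x, y = v
    return (2 * x + 3 * y, 4 * x - 5 * y)

def coords(basis, w):
    """Coordinates (x, y) of w in a basis (u1, u2) of R^2, by Cramer's rule."""
    (a, c), (b, d) = basis          # u1 = (a, c), u2 = (b, d)
    det = a * d - b * c             # nonzero because u1, u2 form a basis
    return (Fraction(w[0] * d - b * w[1], det),
            Fraction(a * w[1] - c * w[0], det))

def matrix_rep(T, basis):
    """Matrix of T in the given basis: column j holds the coordinates of T(u_j)."""
    cols = [coords(basis, T(u)) for u in basis]
    return [[cols[0][0], cols[1][0]],
            [cols[0][1], cols[1][1]]]

S = ((1, 2), (2, 5))
E = ((1, 0), (0, 1))

F_S = matrix_rep(F, S)
F_E = matrix_rep(F, E)
print(F_S)
print(F_E)
```

The coordinate vectors are stored as columns, matching the remark that the coordinates of $F(u_1)$ and $F(u_2)$ form the columns, not the rows, of the matrix.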


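EXAMPLE 6.2 Let $V$ be the vector space of functions with basis $S = \{\sin t, \cos t, e^{3t}\}$, and let $\mathbf{D}: V \to V$ be the differential operator defined by $\mathbf{D}(f(t)) = d(f(t))/dt$. We compute the matrix representing $\mathbf{D}$ in the basis $S$:

\[ \begin{aligned} \mathbf{D}(\sin t) &= \cos t = 0(\sin t) + 1(\cos t) + 0(e^{3t}) \\ \mathbf{D}(\cos t) &= -\sin t = -1(\sin t) + 0(\cos t) + 0(e^{3t}) \\ \mathbf{D}(e^{3t}) &= 3e^{3t} = 0(\sin t) + 0(\cos t) + 3(e^{3t}) \end{aligned} \quad\text{and so}\quad [\mathbf{D}] = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 3 \end{bmatrix} \]

Note that the coordinates of $\mathbf{D}(\sin t)$, $\mathbf{D}(\cos t)$, $\mathbf{D}(e^{3t})$ form the columns, not the rows, of $[\mathbf{D}]$.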


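Matrix Mappings and Their Matrix Representation


Consider the following matrix $A$, which may be viewed as a linear operator on $\mathbf{R}^2$, and basis $S$ of $\mathbf{R}^2$:

\[ A = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix} \quad\text{and}\quad S = \{u_1, u_2\} = \left\{ \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ 5 \end{bmatrix} \right\} \]

(We write vectors as columns, because our map is a matrix.) We find the matrix representation of $A$ relative to the basis $S$.

(1) First we write $A(u_1)$ as a linear combination of $u_1$ and $u_2$. We have

\[ A(u_1) = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} -1 \\ -6 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix} \quad\text{and so}\quad \begin{aligned} x + 2y &= -1 \\ 2x + 5y &= -6 \end{aligned} \]

Solving the system yields $x = 7$, $y = -4$. Thus, $A(u_1) = 7u_1 - 4u_2$.

(2) Next we write $A(u_2)$ as a linear combination of $u_1$ and $u_2$. We have

\[ A(u_2) = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}\begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} -4 \\ -17 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix} \quad\text{and so}\quad \begin{aligned} x + 2y &= -4 \\ 2x + 5y &= -17 \end{aligned} \]

Solving the system yields $x = 14$, $y = -9$. Thus, $A(u_2) = 14u_1 - 9u_2$. Writing the coordinates of $A(u_1)$ and $A(u_2)$ as columns gives us the following matrix representation of $A$:

\[ [A]_S = \begin{bmatrix} 7 & 14 \\ -4 & -9 \end{bmatrix} \]

Remark: Suppose we want to find the matrix representation of $A$ relative to the usual basis $E = \{e_1, e_2\} = \{[1,0]^T, [0,1]^T\}$ of $\mathbf{R}^2$. We have

\[ \begin{aligned} A(e_1) &= \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix} = 3e_1 + 4e_2 \\ A(e_2) &= \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -2 \\ -5 \end{bmatrix} = -2e_1 - 5e_2 \end{aligned} \quad\text{and so}\quad [A]_E = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix} \]

Note that $[A]_E$ is the original matrix $A$. This result is true in general:

The matrix representation of any $n \times n$ square matrix $A$ over a field $K$ relative to the usual basis $E$ of $K^n$ is the matrix $A$ itself; that is,

\[ [A]_E = A \]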

Algorithm for Finding Matrix Representations

Next follows an algorithm for finding matrix representations. Step 0 is optional; the formula it produces can be used in Step 1(b), which is repeated for each basis vector.

ALGORITHM 6.1: The input is a linear operator $T$ on a vector space $V$ and a basis $S = \{u_1, u_2, \ldots, u_n\}$ of $V$. The output is the matrix representation $[T]_S$.

Step 0. Find a formula for the coordinates of an arbitrary vector $v$ relative to the basis $S$.
Step 1. Repeat for each basis vector $u_k$ in $S$:
(a) Find $T(u_k)$.
(b) Write $T(u_k)$ as a linear combination of the basis vectors $u_1, u_2, \ldots, u_n$.
Step 2. Form the matrix $[T]_S$ whose columns are the coordinate vectors in Step 1(b).
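
The steps of Algorithm 6.1 can be sketched in code for operators on $K^n$ with rational entries. This is one possible rendering under stated assumptions, not a definitive implementation: vectors are tuples, the hypothetical helper `solve` replaces the closed-form formula of Step 0 by Gauss-Jordan elimination applied to each $T(u_k)$, and the data are those of Example 6.3.

```python
from fractions import Fraction

def solve(basis, w):
    """Coordinates of w relative to basis (Step 1(b)), by Gauss-Jordan elimination."""
    n = len(basis)
    # Augmented matrix whose columns are the basis vectors, followed by w.
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(w[i])] for i in range(n)]
    for c in range(n):
        p = next(i for i in range(c, n) if m[i][c] != 0)   # a pivot exists for a basis
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [x / piv for x in m[c]]
        for i in range(n):
            if i != c and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[c])]
    return [m[i][n] for i in range(n)]

def matrix_representation(T, basis):
    """Steps 1 and 2: column k holds the coordinates of T(u_k)."""
    cols = [solve(basis, T(u)) for u in basis]              # Step 1
    n = len(basis)
    return [[cols[k][i] for k in range(n)] for i in range(n)]  # Step 2

def F(v):
    x, y = v
    return (2 * x + 3 * y, 4 * x - 5 * y)

S = [(1, -2), (2, -5)]
F_S = matrix_representation(F, S)
print(F_S)
```

Solving one linear system per basis vector avoids Step 0 entirely; deriving the formula of Step 0 once is the better choice when many vectors must be converted by hand.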


EXAMPLE 6.3 Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x,y) = (2x + 3y,\ 4x - 5y)$. Find the matrix representation $[F]_S$ of $F$ relative to the basis $S = \{u_1, u_2\} = \{(1,-2), (2,-5)\}$.

(Step 0) First find the coordinates of $(a,b) \in \mathbf{R}^2$ relative to the basis $S$. We have

\[ \begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ -2 \end{bmatrix} + y\begin{bmatrix} 2 \\ -5 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 2y &= a \\ -2x - 5y &= b \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y &= a \\ -y &= 2a + b \end{aligned} \]

Solving for $x$ and $y$ in terms of $a$ and $b$ yields $x = 5a + 2b$, $y = -2a - b$. Thus,

\[ (a,b) = (5a + 2b)u_1 + (-2a - b)u_2 \]

(Step 1) Now we find $F(u_1)$ and write it as a linear combination of $u_1$ and $u_2$ using the above formula for $(a,b)$, and then we repeat the process for $F(u_2)$. We have

\[ \begin{aligned} F(u_1) &= F(1,-2) = (-4, 14) = 8u_1 - 6u_2 \\ F(u_2) &= F(2,-5) = (-11, 33) = 11u_1 - 11u_2 \end{aligned} \]

(Step 2) Finally, we write the coordinates of $F(u_1)$ and $F(u_2)$ as columns to obtain the required matrix:

\[ [F]_S = \begin{bmatrix} 8 & 11 \\ -6 & -11 \end{bmatrix} \]

Properties of Matrix Representations


This subsection gives the main properties of the matrix representations of linear operators $T$ on a vector space $V$. We emphasize that we are always given a particular basis $S$ of $V$.

Our first theorem, proved in Problem 6.9, tells us that the "action" of a linear operator $T$ on a vector $v$ is preserved by its matrix representation.


THEOREM 6.1: Let $T: V \to V$ be a linear operator, and let $S$ be a (finite) basis of $V$. Then, for any vector $v$ in $V$, $[T]_S[v]_S = [T(v)]_S$.


EXAMPLE 6.4 Consider the linear operator $F$ on $\mathbf{R}^2$ and the basis $S$ of Example 6.3; that is,

\[ F(x,y) = (2x + 3y,\ 4x - 5y) \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,-2), (2,-5)\} \]

Let

\[ v = (5,-7), \quad\text{and so}\quad F(v) = (-11, 55) \]

Using the formula from Example 6.3, we get

\[ [v] = [11, -3]^T \quad\text{and}\quad [F(v)] = [55, -33]^T \]

We verify Theorem 6.1 for this vector $v$ (where $[F]$ is obtained from Example 6.3):

\[ [F][v] = \begin{bmatrix} 8 & 11 \\ -6 & -11 \end{bmatrix}\begin{bmatrix} 11 \\ -3 \end{bmatrix} = \begin{bmatrix} 55 \\ -33 \end{bmatrix} = [F(v)] \]
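
The verification in Example 6.4 extends readily to many vectors at once. The following sketch uses the coordinate formula $(a,b) \mapsto (5a + 2b,\ -2a - b)$ and the matrix $[F]_S$ from Example 6.3; the helper names `coords_S` and `apply` are assumptions introduced for the check.

```python
def F(x, y):
    return (2 * x + 3 * y, 4 * x - 5 * y)

def coords_S(a, b):
    """Coordinates of (a, b) relative to S = {(1, -2), (2, -5)} (Example 6.3, Step 0)."""
    return (5 * a + 2 * b, -2 * a - b)

F_S = [[8, 11], [-6, -11]]      # [F]_S from Example 6.3

def apply(M, c):
    """Multiply a 2x2 matrix by a coordinate column given as a pair."""
    return (M[0][0] * c[0] + M[0][1] * c[1],
            M[1][0] * c[0] + M[1][1] * c[1])

v = (5, -7)
lhs = apply(F_S, coords_S(*v))   # [F]_S [v]_S
rhs = coords_S(*F(*v))           # [F(v)]_S

# The same identity on a grid of integer vectors.
grid_ok = all(apply(F_S, coords_S(x, y)) == coords_S(*F(x, y))
              for x in range(-5, 6) for y in range(-5, 6))

print(lhs, rhs, grid_ok)
```

Agreement on a finite grid is only a sanity check; the general statement is Theorem 6.1.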


Given a basis $S$ of a vector space $V$, we have associated a matrix $[T]$ to each linear operator $T$ in the algebra $A(V)$ of linear operators on $V$. Theorem 6.1 tells us that the "action" of an individual linear operator $T$ is preserved by this representation. The next two theorems (proved in Problems 6.10 and 6.11) tell us that the three basic operations in $A(V)$ with these operators, namely (i) addition, (ii) scalar multiplication, and (iii) composition, are also preserved.


THEOREM 6.2: Let $V$ be an $n$-dimensional vector space over $K$, let $S$ be a basis of $V$, and let $\mathbf{M}$ be the algebra of $n \times n$ matrices over $K$. Then the mapping

\[ m: A(V) \to \mathbf{M} \quad\text{defined by}\quad m(T) = [T]_S \]

is a vector space isomorphism. That is, for any $F, G \in A(V)$ and any $k \in K$,
(i) $m(F + G) = m(F) + m(G)$ or $[F + G] = [F] + [G]$
(ii) $m(kF) = km(F)$ or $[kF] = k[F]$
(iii) $m$ is bijective (one-to-one and onto).


THEOREM 6.3: For any linear operators $F, G \in A(V)$,

\[ m(G \circ F) = m(G)m(F) \quad\text{or}\quad [G \circ F] = [G][F] \]

(Here $G \circ F$ denotes the composition of the maps $G$ and $F$.)
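
Theorems 6.2 and 6.3 can be illustrated with the operator $F$ and basis $S$ of Example 6.3. In the sketch below the second operator `G` is a hypothetical choice made only for this illustration, and `coords_S`, `rep`, `matmul`, and `matadd` are helper names introduced here.

```python
def F(x, y):
    return (2 * x + 3 * y, 4 * x - 5 * y)

def G(x, y):
    # Hypothetical second operator, chosen only for illustration.
    return (x + y, 3 * x - 2 * y)

S = [(1, -2), (2, -5)]

def coords_S(a, b):
    """Coordinates relative to S, using the formula from Example 6.3."""
    return (5 * a + 2 * b, -2 * a - b)

def rep(T):
    """Matrix of T in the basis S: column k holds the coordinates of T(u_k)."""
    cols = [coords_S(*T(*u)) for u in S]
    return [[cols[0][0], cols[1][0]], [cols[0][1], cols[1][1]]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matadd(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def G_after_F(x, y):
    return G(*F(x, y))

def F_plus_G(x, y):
    fx, fy = F(x, y)
    gx, gy = G(x, y)
    return (fx + gx, fy + gy)

composition_ok = rep(G_after_F) == matmul(rep(G), rep(F))   # Theorem 6.3
addition_ok = rep(F_plus_G) == matadd(rep(F), rep(G))       # Theorem 6.2(i)
print(rep(F), rep(G), composition_ok, addition_ok)
```

The order of the factors matters: the matrix of $G \circ F$ is $[G][F]$, with the matrix of the map applied first on the right.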


6.3 Change of Basis 


Let V be an n-dimensional vector space over a field K. We have shown that once we have selected a basis 
S of V, every vector v € V can be represented by means of an n-tuple [v]; in K”, and every linear operator 
T in A(V) can be represented by an n x n matrix over K. We ask the following natural question: 


How do our representations change if we select another basis? 


In order to answer this question, we first need a definition. 


DEFINITION: Let $S = \{u_1, u_2, \ldots, u_n\}$ be a basis of a vector space $V$, and let $S' = \{v_1, v_2, \ldots, v_n\}$
be another basis. (For reference, we will call $S$ the "old" basis and $S'$ the "new"
basis.) Because $S$ is a basis, each vector in the "new" basis $S'$ can be written uniquely
as a linear combination of the vectors in $S$; say,

\[
\begin{aligned}
v_1 &= a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\
v_2 &= a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\
&\;\;\vdots \\
v_n &= a_{n1}u_1 + a_{n2}u_2 + \cdots + a_{nn}u_n
\end{aligned}
\]

Let $P$ be the transpose of the above matrix of coefficients; that is, let $P = [p_{ij}]$, where
$p_{ij} = a_{ji}$. Then $P$ is called the change-of-basis matrix (or transition matrix) from the
"old" basis $S$ to the "new" basis $S'$.


The following remarks are in order. 


Remark 1: The above change-of-basis matrix $P$ may also be viewed as the matrix whose columns
are, respectively, the coordinate column vectors of the "new" basis vectors $v_i$ relative to the "old" basis
$S$; namely,

\[ P = \big[\,[v_1]_S,\ [v_2]_S,\ \ldots,\ [v_n]_S\,\big] \]

Remark 2: Analogously, there is a change-of-basis matrix $Q$ from the "new" basis $S'$ to the
"old" basis $S$. Similarly, $Q$ may be viewed as the matrix whose columns are, respectively, the coordinate
column vectors of the "old" basis vectors $u_i$ relative to the "new" basis $S'$; namely,

\[ Q = \big[\,[u_1]_{S'},\ [u_2]_{S'},\ \ldots,\ [u_n]_{S'}\,\big] \]


Remark 3: Because the vectors $v_1, v_2, \ldots, v_n$ in the new basis $S'$ are linearly independent, the
matrix $P$ is invertible (Problem 6.18). Similarly, $Q$ is invertible. In fact, we have the following
proposition (proved in Problem 6.18).


PROPOSITION 6.4: Let $P$ and $Q$ be the above change-of-basis matrices. Then $Q = P^{-1}$.


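Now suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of a vector space $V$, and suppose $P = [p_{ij}]$ is any
nonsingular matrix. Then the $n$ vectors

\[ v_i = p_{1i}u_1 + p_{2i}u_2 + \cdots + p_{ni}u_n, \qquad i = 1, 2, \ldots, n \]

corresponding to the columns of $P$, are linearly independent [Problem 6.21(a)]. Thus, they form another
basis $S'$ of $V$. Moreover, $P$ will be the change-of-basis matrix from $S$ to the new basis $S'$.

EXAMPLE 6.5 Consider the following two bases of $\mathbf{R}^2$:

\[ S = \{u_1, u_2\} = \{(1,2),\ (3,5)\} \quad\text{and}\quad S' = \{v_1, v_2\} = \{(1,-1),\ (1,-2)\} \]

(a) Find the change-of-basis matrix $P$ from $S$ to the "new" basis $S'$.
Write each of the new basis vectors of $S'$ as a linear combination of the original basis vectors $u_1$ and $u_2$ of
$S$. We have

\[ \begin{bmatrix} 1 \\ -1 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 3 \\ 5 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 3y &= 1 \\ 2x + 5y &= -1 \end{aligned} \quad\text{yielding}\quad x = -8,\ y = 3 \]

\[ \begin{bmatrix} 1 \\ -2 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 3 \\ 5 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 3y &= 1 \\ 2x + 5y &= -2 \end{aligned} \quad\text{yielding}\quad x = -11,\ y = 4 \]

Thus,

\[ \begin{aligned} v_1 &= -8u_1 + 3u_2 \\ v_2 &= -11u_1 + 4u_2 \end{aligned} \qquad\text{and hence,}\qquad P = \begin{bmatrix} -8 & -11 \\ 3 & 4 \end{bmatrix} \]

Note that the coordinates of $v_1$ and $v_2$ are the columns, not rows, of the change-of-basis matrix $P$.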


(b) Find the change-of-basis matrix $Q$ from the "new" basis $S'$ back to the "old" basis $S$.
Here we write each of the "old" basis vectors $u_1$ and $u_2$ of $S$ as a linear combination of the "new" basis
vectors $v_1$ and $v_2$ of $S'$. This yields

\[ \begin{aligned} u_1 &= 4v_1 - 3v_2 \\ u_2 &= 11v_1 - 8v_2 \end{aligned} \qquad\text{and hence,}\qquad Q = \begin{bmatrix} 4 & 11 \\ -3 & -8 \end{bmatrix} \]

As expected from Proposition 6.4, $Q = P^{-1}$. (In fact, we could have obtained $Q$ by simply finding $P^{-1}$.)
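Example 6.5 can be reproduced with a few lines of code. The following Python sketch is an illustration rather than part of the original text; the function names `coords`, `change_of_basis`, and `matmul` are assumptions introduced here, and the code is limited to bases of $\mathbf{R}^2$. It computes $P$ and $Q$ directly from the definition (columns are coordinate vectors) and checks that $QP = I$, as Proposition 6.4 asserts.

```python
from fractions import Fraction

def coords(basis, w):
    """Coordinates of w relative to a basis {u1, u2} of R^2 (Cramer's rule)."""
    (a, c), (b, d) = basis                 # u1 = (a, c), u2 = (b, d)
    det = Fraction(a * d - b * c)
    return [(w[0] * d - b * w[1]) / det, (a * w[1] - c * w[0]) / det]

def change_of_basis(old, new):
    """Matrix whose columns are the coordinates of the new basis vectors
    relative to the old basis."""
    cols = [coords(old, w) for w in new]
    return [[cols[j][i] for j in range(2)] for i in range(2)]

def matmul(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

S = [(1, 2), (3, 5)]          # "old" basis
S_new = [(1, -1), (1, -2)]    # "new" basis

P = change_of_basis(S, S_new)     # from S to S'
Q = change_of_basis(S_new, S)     # from S' back to S

print(matmul(Q, P) == [[1, 0], [0, 1]])
```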


EXAMPLE 6.6 Consider the following two bases of $\mathbf{R}^3$:

\[ E = \{e_1, e_2, e_3\} = \{(1,0,0),\ (0,1,0),\ (0,0,1)\} \]
and
\[ S = \{u_1, u_2, u_3\} = \{(1,0,1),\ (2,1,2),\ (1,2,2)\} \]

(a) Find the change-of-basis matrix $P$ from the basis $E$ to the basis $S$.
Because $E$ is the usual basis, we can immediately write each basis element of $S$ as a linear combination of
the basis elements of $E$. Specifically,

\[
\begin{aligned}
u_1 &= (1,0,1) = e_1 + e_3 \\
u_2 &= (2,1,2) = 2e_1 + e_2 + 2e_3 \\
u_3 &= (1,2,2) = e_1 + 2e_2 + 2e_3
\end{aligned}
\qquad\text{and hence,}\qquad
P = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 2 \end{bmatrix}
\]

Again, the coordinates of $u_1, u_2, u_3$ appear as the columns in $P$. Observe that $P$ is simply the matrix whose
columns are the basis vectors of $S$. This is true only because the original basis was the usual basis $E$.

(b) Find the change-of-basis matrix $Q$ from the basis $S$ to the basis $E$.
The definition of the change-of-basis matrix $Q$ tells us to write each of the (usual) basis vectors in $E$ as a
linear combination of the basis elements of $S$. This yields

\[
\begin{aligned}
e_1 &= (1,0,0) = -2u_1 + 2u_2 - u_3 \\
e_2 &= (0,1,0) = -2u_1 + u_2 \\
e_3 &= (0,0,1) = 3u_1 - 2u_2 + u_3
\end{aligned}
\qquad\text{and hence,}\qquad
Q = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix}
\]

We emphasize that to find $Q$, we need to solve three $3 \times 3$ systems of linear equations, one $3 \times 3$ system for
each of $e_1, e_2, e_3$.
Alternatively, we can find $Q = P^{-1}$ by forming the matrix $M = [P, I]$ and row reducing $M$ to row
canonical form:

\[
M = \left[\begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ 0 & 1 & 2 & 0 & 1 & 0 \\ 1 & 2 & 2 & 0 & 0 & 1 \end{array}\right]
\sim
\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -2 & -2 & 3 \\ 0 & 1 & 0 & 2 & 1 & -2 \\ 0 & 0 & 1 & -1 & 0 & 1 \end{array}\right]
= [I, P^{-1}]
\]

thus,
\[ Q = P^{-1} = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix} \]

(Here we have used the fact that $Q$ is the inverse of $P$.)
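The row reduction of $M = [P, I]$ to $[I, P^{-1}]$ is easy to automate. The sketch below is one possible Gauss-Jordan implementation in Python, given for illustration only; the function name `inverse_by_row_reduction` is an assumption, and exact fractions are used so the result matches hand computation.

```python
from fractions import Fraction

def inverse_by_row_reduction(P):
    """Row reduce the block matrix [P | I] to [I | P^-1] and return P^-1."""
    n = len(P)
    # Build the augmented matrix M = [P | I] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(P)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot equals 1.
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        # Clear every other entry in the pivot column.
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    # The right-hand block now holds the inverse.
    return [row[n:] for row in M]

P = [[1, 2, 1], [0, 1, 2], [1, 2, 2]]   # the matrix P of Example 6.6
Q = inverse_by_row_reduction(P)
print([[int(x) for x in row] for row in Q])
```

The sequence of row operations chosen by the code may differ from the one used by hand, but the row canonical form, and hence $P^{-1}$, is unique.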


The result in Example 6.6(a) is true in general. We state this result formally, because it occurs often. 


PROPOSITION 6.5: The change-of-basis matrix from the usual basis E of K” to any basis S of K” is 
the matrix P whose columns are, respectively, the basis vectors of S. 


Applications of Change-of-Basis Matrix 


First we show how a change of basis affects the coordinates of a vector in a vector space V. The 
following theorem is proved in Problem 6.22. 


THEOREM 6.6: Let P be the change-of-basis matrix from a basis S to a basis S’ in a vector space V. 
Then, for any vector v € V, we have 


\[ P[v]_{S'} = [v]_S \quad\text{and hence,}\quad P^{-1}[v]_S = [v]_{S'} \]


Namely, if we multiply the coordinates of $v$ in the original basis $S$ by $P^{-1}$, we get the coordinates of $v$
in the new basis $S'$.


Remark 1: Although $P$ is called the change-of-basis matrix from the old basis $S$ to the new basis
$S'$, we emphasize that $P^{-1}$ transforms the coordinates of $v$ in the original basis $S$ into the coordinates of $v$
in the new basis $S'$.


Remark 2: Because of the above theorem, many texts call $Q = P^{-1}$, not $P$, the transition matrix
from the old basis $S$ to the new basis $S'$. Some texts also refer to $Q$ as the change-of-coordinates matrix.


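We now give the proof of the above theorem for the special case that $\dim V = 3$. Suppose $P$ is the
change-of-basis matrix from the basis $S = \{u_1, u_2, u_3\}$ to the basis $S' = \{v_1, v_2, v_3\}$; say,

\[
\begin{aligned}
v_1 &= a_1u_1 + a_2u_2 + a_3u_3 \\
v_2 &= b_1u_1 + b_2u_2 + b_3u_3 \\
v_3 &= c_1u_1 + c_2u_2 + c_3u_3
\end{aligned}
\qquad\text{and hence,}\qquad
P = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}
\]

Now suppose $v \in V$ and, say, $v = k_1v_1 + k_2v_2 + k_3v_3$. Then, substituting for $v_1, v_2, v_3$ from above, we
obtain

\[
\begin{aligned}
v &= k_1(a_1u_1 + a_2u_2 + a_3u_3) + k_2(b_1u_1 + b_2u_2 + b_3u_3) + k_3(c_1u_1 + c_2u_2 + c_3u_3) \\
  &= (a_1k_1 + b_1k_2 + c_1k_3)u_1 + (a_2k_1 + b_2k_2 + c_2k_3)u_2 + (a_3k_1 + b_3k_2 + c_3k_3)u_3
\end{aligned}
\]

Thus,

\[ [v]_{S'} = \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix} \quad\text{and}\quad [v]_S = \begin{bmatrix} a_1k_1 + b_1k_2 + c_1k_3 \\ a_2k_1 + b_2k_2 + c_2k_3 \\ a_3k_1 + b_3k_2 + c_3k_3 \end{bmatrix} \]

Accordingly,

\[ P[v]_{S'} = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix} = \begin{bmatrix} a_1k_1 + b_1k_2 + c_1k_3 \\ a_2k_1 + b_2k_2 + c_2k_3 \\ a_3k_1 + b_3k_2 + c_3k_3 \end{bmatrix} = [v]_S \]

Finally, multiplying the equation $[v]_S = P[v]_{S'}$ by $P^{-1}$, we get

\[ P^{-1}[v]_S = P^{-1}P[v]_{S'} = I[v]_{S'} = [v]_{S'} \]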


The next theorem (proved in Problem 6.26) shows how a change of basis affects the matrix 
representation of a linear operator. 


THEOREM 6.7: Let P be the change-of-basis matrix from a basis S to a basis S’ in a vector space V. 
Then, for any linear operator T on V, 


\[ [T]_{S'} = P^{-1}[T]_S P \]


That is, if $A$ and $B$ are the matrix representations of $T$ relative, respectively, to $S$ and
$S'$, then


\[ B = P^{-1}AP \]


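EXAMPLE 6.7 Consider the following two bases of $\mathbf{R}^3$:

\[ E = \{e_1, e_2, e_3\} = \{(1,0,0),\ (0,1,0),\ (0,0,1)\} \]
and
\[ S = \{u_1, u_2, u_3\} = \{(1,0,1),\ (2,1,2),\ (1,2,2)\} \]

The change-of-basis matrix $P$ from $E$ to $S$ and its inverse $P^{-1}$ were obtained in Example 6.6.

(a) Write $v = (1,3,5)$ as a linear combination of $u_1, u_2, u_3$, or, equivalently, find $[v]_S$.
One way to do this is to directly solve the vector equation $v = xu_1 + yu_2 + zu_3$; that is,

\[ \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = x\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + y\begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} + z\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 2y + z &= 1 \\ y + 2z &= 3 \\ x + 2y + 2z &= 5 \end{aligned} \]

The solution is $x = 7$, $y = -5$, $z = 4$, so $v = 7u_1 - 5u_2 + 4u_3$.
On the other hand, we know that $[v]_E = [1,3,5]^T$, because $E$ is the usual basis, and we already know $P^{-1}$.
Therefore, by Theorem 6.6,

\[ [v]_S = P^{-1}[v]_E = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = \begin{bmatrix} 7 \\ -5 \\ 4 \end{bmatrix} \]

Thus, again, $v = 7u_1 - 5u_2 + 4u_3$.

(b) Let $A = \begin{bmatrix} 1 & 3 & -2 \\ 2 & -4 & 1 \\ 3 & -1 & 2 \end{bmatrix}$, which may be viewed as a linear operator on $\mathbf{R}^3$. Find the matrix $B$ that represents $A$
relative to the basis $S$.
The definition of the matrix representation of $A$ relative to the basis $S$ tells us to write each of $A(u_1)$, $A(u_2)$,
$A(u_3)$ as a linear combination of the basis vectors $u_1, u_2, u_3$ of $S$. This yields

\[
\begin{aligned}
A(u_1) &= (-1,3,5) = 11u_1 - 9u_2 + 6u_3 \\
A(u_2) &= (1,2,9) = 21u_1 - 14u_2 + 8u_3 \\
A(u_3) &= (3,-4,5) = 17u_1 - 8u_2 + 2u_3
\end{aligned}
\qquad\text{and hence,}\qquad
B = \begin{bmatrix} 11 & 21 & 17 \\ -9 & -14 & -8 \\ 6 & 8 & 2 \end{bmatrix}
\]

We emphasize that to find $B$, we need to solve three $3 \times 3$ systems of linear equations, one $3 \times 3$ system for
each of $A(u_1)$, $A(u_2)$, $A(u_3)$.
On the other hand, because we know $P$ and $P^{-1}$, we can use Theorem 6.7. That is,

\[
B = P^{-1}AP = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 3 & -2 \\ 2 & -4 & 1 \\ 3 & -1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 2 \end{bmatrix} = \begin{bmatrix} 11 & 21 & 17 \\ -9 & -14 & -8 \\ 6 & 8 & 2 \end{bmatrix}
\]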


This, as expected, gives the same result. 
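Both parts of Example 6.7 reduce to matrix products once $P$ and $P^{-1}$ are known. The following Python sketch is illustrative only (the helper `matmul` is an assumption introduced here). It computes $[v]_S = P^{-1}[v]_E$ and $B = P^{-1}AP$, and then confirms the defining property of $B$, namely that $PB = AP$, so that each column of $B$ gives the coordinates of $A(u_i)$ relative to $S$.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

P = [[1, 2, 1], [0, 1, 2], [1, 2, 2]]           # columns are u1, u2, u3
P_inv = [[-2, -2, 3], [2, 1, -2], [-1, 0, 1]]   # from Example 6.6
A = [[1, 3, -2], [2, -4, 1], [3, -1, 2]]

# Part (a): coordinates of v = (1, 3, 5) relative to S, as a column vector.
v_S = matmul(P_inv, [[1], [3], [5]])

# Part (b): matrix of A relative to S, by Theorem 6.7.
B = matmul(matmul(P_inv, A), P)

# P B = A P says that column i of B holds the S-coordinates of A(u_i).
print(v_S, matmul(P, B) == matmul(A, P))
```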


6.4 Similarity 


Suppose $A$ and $B$ are square matrices for which there exists an invertible matrix $P$ such that $B = P^{-1}AP$;
then $B$ is said to be similar to $A$, or $B$ is said to be obtained from $A$ by a similarity transformation. We
show (Problem 6.29) that similarity of matrices is an equivalence relation.

By Theorem 6.7 and the above remark, we have the following basic result. 


THEOREM 6.8: Two matrices represent the same linear operator if and only if the matrices are 
similar. 


That is, all the matrix representations of a linear operator T form an equivalence class of similar 
matrices. 

A linear operator T is said to be diagonalizable if there exists a basis S of V such that T is represented 
by a diagonal matrix; the basis S is then said to diagonalize T. The preceding theorem gives us the 
following result. 


THEOREM 6.9: Let $A$ be the matrix representation of a linear operator $T$. Then $T$ is diagonalizable
if and only if there exists an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal
matrix.


That is, T is diagonalizable if and only if its matrix representation can be diagonalized by a similarity 
transformation. 

We emphasize that not every operator is diagonalizable. However, we will show (Chapter 10) that 
every linear operator can be represented by certain ‘‘standard’’ matrices called its normal or canonical 
forms. Such a discussion will require some theory of fields, polynomials, and determinants. 


Functions and Similar Matrices 


Suppose $f$ is a function on square matrices that assigns the same value to similar matrices; that is,
$f(A) = f(B)$ whenever $A$ is similar to $B$. Then $f$ induces a function, also denoted by $f$, on linear operators
$T$ in the following natural way. We define


\[ f(T) = f([T]_S) \]


where S is any basis. By Theorem 6.8, the function is well defined. 
The determinant (Chapter 8) is perhaps the most important example of such a function. The trace
(Section 2.7) is another important example of such a function.

EXAMPLE 6.8 Consider the following linear operator $F$ and bases $E$ and $S$ of $\mathbf{R}^2$:

\[ F(x,y) = (2x + 3y,\ 4x - 5y), \qquad E = \{(1,0),\ (0,1)\}, \qquad S = \{(1,2),\ (2,5)\} \]

By Example 6.1, the matrix representations of $F$ relative to the bases $E$ and $S$ are, respectively,

\[ A = \begin{bmatrix} 2 & 3 \\ 4 & -5 \end{bmatrix} \quad\text{and}\quad B = \begin{bmatrix} 52 & 129 \\ -22 & -55 \end{bmatrix} \]

Using matrix $A$, we have

(i) Determinant of $F = \det(A) = -10 - 12 = -22$; (ii) Trace of $F = \mathrm{tr}(A) = 2 - 5 = -3$.

On the other hand, using matrix $B$, we have

(i) Determinant of $F = \det(B) = -2860 + 2838 = -22$; (ii) Trace of $F = \mathrm{tr}(B) = 52 - 55 = -3$.


As expected, both matrices yield the same result. 
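The invariance of determinant and trace under similarity can be checked numerically for Example 6.8. The Python sketch below is an illustration, not the text's own method; `det2`, `trace`, and `matmul` are helper names assumed here, and $P$ is taken to be the matrix whose columns are the vectors of $S$ (Proposition 6.5), with its inverse computed by the $2 \times 2$ inverse formula.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def matmul(A, B):
    """Product of two 2 x 2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 3], [4, -5]]          # [F] relative to the usual basis E
P = [[1, 2], [2, 5]]           # columns are the vectors of S
P_inv = [[5, -2], [-2, 1]]     # inverse of P (det P = 1)

B = matmul(matmul(P_inv, A), P)    # [F] relative to S, by Theorem 6.7

print(det2(A) == det2(B), trace(A) == trace(B))
```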


6.5 Matrices and General Linear Mappings 


Last, we consider the general case of linear mappings from one vector space into another. Suppose V and 
U are vector spaces over the same field K and, say, dim V = m and dim U = n. Furthermore, suppose 


\[ S = \{v_1, v_2, \ldots, v_m\} \quad\text{and}\quad S' = \{u_1, u_2, \ldots, u_n\} \]
are arbitrary but fixed bases, respectively, of $V$ and $U$.


Suppose $F: V \to U$ is a linear mapping. Then the vectors $F(v_1), F(v_2), \ldots, F(v_m)$ belong to $U$,
and so each is a linear combination of the basis vectors in $S'$; say,

\[
\begin{aligned}
F(v_1) &= a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\
F(v_2) &= a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\
&\;\;\vdots \\
F(v_m) &= a_{m1}u_1 + a_{m2}u_2 + \cdots + a_{mn}u_n
\end{aligned}
\]

DEFINITION: The transpose of the above matrix of coefficients, denoted by $m_{S,S'}(F)$ or $[F]_{S,S'}$, is
called the matrix representation of $F$ relative to the bases $S$ and $S'$. [We will use the
simple notation $m(F)$ and $[F]$ when the bases are understood.]


The following theorem is analogous to Theorem 6.1 for linear operators (Problem 6.67).

THEOREM 6.10: For any vector $v \in V$, $[F]_{S,S'}[v]_S = [F(v)]_{S'}$.


That is, multiplying the coordinates of $v$ in the basis $S$ of $V$ by $[F]$, we obtain the coordinates of $F(v)$
in the basis $S'$ of $U$.

Recall that for any vector spaces $V$ and $U$, the collection of all linear mappings from $V$ into $U$ is a
vector space and is denoted by $\mathrm{Hom}(V, U)$. The following theorem is analogous to Theorem 6.2 for linear
operators, where now we let $M = M_{n,m}$ denote the vector space of all $n \times m$ matrices (Problem 6.67); recall
that $\dim V = m$ and $\dim U = n$, so $[F]$, being the transpose of an $m \times n$ array of coefficients, has $n$ rows and $m$ columns.


THEOREM 6.11: The mapping $m: \mathrm{Hom}(V,U) \to M$ defined by $m(F) = [F]$ is a vector space
isomorphism. That is, for any $F, G \in \mathrm{Hom}(V, U)$ and any scalar $k$,
(i) $m(F+G) = m(F) + m(G)$ or $[F+G] = [F] + [G]$
(ii) $m(kF) = k\,m(F)$ or $[kF] = k[F]$
(iii) $m$ is bijective (one-to-one and onto).

Our next theorem is analogous to Theorem 6.3 for linear operators (Problem 6.67).

THEOREM 6.12: Let $S, S', S''$ be bases of vector spaces $V, U, W$, respectively. Let $F: V \to U$ and
$G: U \to W$ be linear mappings. Then

\[ [G \circ F]_{S,S''} = [G]_{S',S''}\,[F]_{S,S'} \]

That is, relative to the appropriate bases, the matrix representation of the composition of two
mappings is the matrix product of the matrix representations of the individual mappings.


Next we show how the matrix representation of a linear mapping F: V — U is affected when new 
bases are selected (Problem 6.67). 


THEOREM 6.13: Let $P$ be the change-of-basis matrix from a basis $e$ to a basis $e'$ in $V$, and let $Q$ be
the change-of-basis matrix from a basis $f$ to a basis $f'$ in $U$. Then, for any linear
map $F: V \to U$,

\[ [F]_{e',f'} = Q^{-1}[F]_{e,f}\,P \]

In other words, if $A$ is the matrix representation of a linear mapping $F$ relative to the bases $e$ and $f$,
and $B$ is the matrix representation of $F$ relative to the bases $e'$ and $f'$, then

\[ B = Q^{-1}AP \]

Our last theorem, proved in Problem 6.36, shows that any linear mapping from one vector space $V$
into another vector space $U$ can be represented by a very simple matrix. We note that this theorem is
analogous to Theorem 3.18 for $m \times n$ matrices.


THEOREM 6.14: Let $F: V \to U$ be linear and, say, $\mathrm{rank}(F) = r$. Then there exist bases of $V$ and $U$
such that the matrix representation of $F$ has the form

\[ A = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \]

where $I_r$ is the $r$-square identity matrix.


The above matrix $A$ is called the normal or canonical form of the linear map $F$.


SOLVED PROBLEMS 


Matrix Representation of Linear Operators 
6.1. Consider the linear mapping $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x,y) = (3x + 4y,\ 2x - 5y)$ and the
following bases of $\mathbf{R}^2$:

\[ E = \{e_1, e_2\} = \{(1,0),\ (0,1)\} \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,2),\ (2,3)\} \]

(a) Find the matrix $A$ representing $F$ relative to the basis $E$.
(b) Find the matrix $B$ representing $F$ relative to the basis $S$.


(a) Because $E$ is the usual basis, the rows of $A$ are simply the coefficients in the components of $F(x,y)$; that
is, using $(a,b) = ae_1 + be_2$, we have

\[
\begin{aligned}
F(e_1) &= F(1,0) = (3,2) = 3e_1 + 2e_2 \\
F(e_2) &= F(0,1) = (4,-5) = 4e_1 - 5e_2
\end{aligned}
\qquad\text{and so}\qquad
A = \begin{bmatrix} 3 & 4 \\ 2 & -5 \end{bmatrix}
\]

Note that the coefficients of the basis vectors are written as columns in the matrix representation.

(b) First find $F(u_1)$ and write it as a linear combination of the basis vectors $u_1$ and $u_2$. We have

\[ F(u_1) = F(1,2) = (11,-8) = x(1,2) + y(2,3), \quad\text{and so}\quad \begin{aligned} x + 2y &= 11 \\ 2x + 3y &= -8 \end{aligned} \]

Solve the system to obtain $x = -49$, $y = 30$. Therefore,

\[ F(u_1) = -49u_1 + 30u_2 \]

Next find $F(u_2)$ and write it as a linear combination of the basis vectors $u_1$ and $u_2$. We have

\[ F(u_2) = F(2,3) = (18,-11) = x(1,2) + y(2,3), \quad\text{and so}\quad \begin{aligned} x + 2y &= 18 \\ 2x + 3y &= -11 \end{aligned} \]

Solve for $x$ and $y$ to obtain $x = -76$, $y = 47$. Hence,

\[ F(u_2) = -76u_1 + 47u_2 \]

Write the coefficients of $u_1$ and $u_2$ as columns to obtain $B = \begin{bmatrix} -49 & -76 \\ 30 & 47 \end{bmatrix}$.

(b') Alternatively, one can first find the coordinates of an arbitrary vector $(a,b)$ in $\mathbf{R}^2$ relative to the basis $S$.
We have

\[ (a,b) = x(1,2) + y(2,3) = (x + 2y,\ 2x + 3y), \quad\text{and so}\quad \begin{aligned} x + 2y &= a \\ 2x + 3y &= b \end{aligned} \]

Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = -3a + 2b$, $y = 2a - b$. Thus,

\[ (a,b) = (-3a + 2b)u_1 + (2a - b)u_2 \]

Then use the formula for $(a,b)$ to find the coordinates of $F(u_1)$ and $F(u_2)$ relative to $S$:

\[
\begin{aligned}
F(u_1) &= F(1,2) = (11,-8) = -49u_1 + 30u_2 \\
F(u_2) &= F(2,3) = (18,-11) = -76u_1 + 47u_2
\end{aligned}
\qquad\text{and so}\qquad
B = \begin{bmatrix} -49 & -76 \\ 30 & 47 \end{bmatrix}
\]


6.2. Consider the following linear operator $G$ on $\mathbf{R}^2$ and basis $S$:

\[ G(x,y) = (2x - 7y,\ 4x + 3y) \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,3),\ (2,5)\} \]

(a) Find the matrix representation $[G]_S$ of $G$ relative to $S$.
(b) Verify $[G]_S[v]_S = [G(v)]_S$ for the vector $v = (4,-3)$ in $\mathbf{R}^2$.


First find the coordinates of an arbitrary vector $v = (a,b)$ in $\mathbf{R}^2$ relative to the basis $S$. We
have

\[ \begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ 3 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix}, \quad\text{and so}\quad \begin{aligned} x + 2y &= a \\ 3x + 5y &= b \end{aligned} \]

Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = -5a + 2b$, $y = 3a - b$. Thus,

\[ (a,b) = (-5a + 2b)u_1 + (3a - b)u_2, \quad\text{and so}\quad [v] = [-5a + 2b,\ 3a - b]^T \]

(a) Using the formula for $(a,b)$ and $G(x,y) = (2x - 7y,\ 4x + 3y)$, we have

\[
\begin{aligned}
G(u_1) &= G(1,3) = (-19,13) = 121u_1 - 70u_2 \\
G(u_2) &= G(2,5) = (-31,23) = 201u_1 - 116u_2
\end{aligned}
\qquad\text{and so}\qquad
[G]_S = \begin{bmatrix} 121 & 201 \\ -70 & -116 \end{bmatrix}
\]

(We emphasize that the coefficients of $u_1$ and $u_2$ are written as columns, not rows, in the matrix representation.)

(b) Use the formula $(a,b) = (-5a + 2b)u_1 + (3a - b)u_2$ to get

\[
\begin{aligned}
v &= (4,-3) = -26u_1 + 15u_2 \\
G(v) &= G(4,-3) = (29,7) = -131u_1 + 80u_2
\end{aligned}
\]

Then $[v]_S = [-26, 15]^T$ and $[G(v)]_S = [-131, 80]^T$.


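Accordingly,

\[ [G]_S[v]_S = \begin{bmatrix} 121 & 201 \\ -70 & -116 \end{bmatrix} \begin{bmatrix} -26 \\ 15 \end{bmatrix} = \begin{bmatrix} -131 \\ 80 \end{bmatrix} = [G(v)]_S \]

(This is expected from Theorem 6.1.)


6.3. Consider the following $2 \times 2$ matrix $A$ and basis $S$ of $\mathbf{R}^2$:

\[ A = \begin{bmatrix} 2 & 4 \\ 5 & 6 \end{bmatrix} \quad\text{and}\quad S = \{u_1, u_2\} = \left\{ \begin{bmatrix} 1 \\ -2 \end{bmatrix},\ \begin{bmatrix} 3 \\ -7 \end{bmatrix} \right\} \]

The matrix $A$ defines a linear operator on $\mathbf{R}^2$. Find the matrix $B$ that represents the mapping $A$
relative to the basis $S$.
First find the coordinates of an arbitrary vector $(a,b)^T$ with respect to the basis $S$. We have

\[ \begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ -2 \end{bmatrix} + y\begin{bmatrix} 3 \\ -7 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + 3y &= a \\ -2x - 7y &= b \end{aligned} \]

Solve for $x$ and $y$ in terms of $a$ and $b$ to obtain $x = 7a + 3b$, $y = -2a - b$. Thus,

\[ (a,b)^T = (7a + 3b)u_1 + (-2a - b)u_2 \]

Then use the formula for $(a,b)^T$ to find the coordinates of $Au_1$ and $Au_2$ relative to the basis $S$:

\[
\begin{aligned}
Au_1 &= \begin{bmatrix} 2 & 4 \\ 5 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} -6 \\ -7 \end{bmatrix} = -63u_1 + 19u_2 \\
Au_2 &= \begin{bmatrix} 2 & 4 \\ 5 & 6 \end{bmatrix} \begin{bmatrix} 3 \\ -7 \end{bmatrix} = \begin{bmatrix} -22 \\ -27 \end{bmatrix} = -235u_1 + 71u_2
\end{aligned}
\]

Writing the coordinates as columns yields

\[ B = \begin{bmatrix} -63 & -235 \\ 19 & 71 \end{bmatrix} \]


6.4. Find the matrix representation of each of the following linear operators $F$ on $\mathbf{R}^3$ relative to the
usual basis $E = \{e_1, e_2, e_3\}$ of $\mathbf{R}^3$; that is, find $[F] = [F]_E$:

(a) $F$ defined by $F(x,y,z) = (x + 2y - 3z,\ 4x - 5y - 6z,\ 7x + 8y + 9z)$.

(b) $F$ defined by the $3 \times 3$ matrix $A = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 3 & 4 \\ 5 & 5 & 5 \end{bmatrix}$.

(c) $F$ defined by $F(e_1) = (1,3,5)$, $F(e_2) = (2,4,6)$, $F(e_3) = (7,7,7)$. (Theorem 5.2 states that a
linear map is completely defined by its action on the vectors in a basis.)


(a) Because $E$ is the usual basis, simply write the coefficients of the components of $F(x,y,z)$ as rows:

\[ [F] = \begin{bmatrix} 1 & 2 & -3 \\ 4 & -5 & -6 \\ 7 & 8 & 9 \end{bmatrix} \]

(b) Because $E$ is the usual basis, $[F] = A$, the matrix $A$ itself.

(c) Here

\[
\begin{aligned}
F(e_1) &= (1,3,5) = e_1 + 3e_2 + 5e_3 \\
F(e_2) &= (2,4,6) = 2e_1 + 4e_2 + 6e_3 \\
F(e_3) &= (7,7,7) = 7e_1 + 7e_2 + 7e_3
\end{aligned}
\qquad\text{and so}\qquad
[F] = \begin{bmatrix} 1 & 2 & 7 \\ 3 & 4 & 7 \\ 5 & 6 & 7 \end{bmatrix}
\]

That is, the columns of $[F]$ are the images of the usual basis vectors.


6.5. Let $G$ be the linear operator on $\mathbf{R}^3$ defined by $G(x,y,z) = (2y + z,\ x - 4y,\ 3x)$.

(a) Find the matrix representation of $G$ relative to the basis

\[ S = \{w_1, w_2, w_3\} = \{(1,1,1),\ (1,1,0),\ (1,0,0)\} \]

(b) Verify that $[G][v] = [G(v)]$ for any vector $v$ in $\mathbf{R}^3$.

First find the coordinates of an arbitrary vector $(a,b,c) \in \mathbf{R}^3$ with respect to the basis $S$. Write $(a,b,c)$ as
a linear combination of $w_1, w_2, w_3$ using unknown scalars $x$, $y$, and $z$:

\[ (a,b,c) = x(1,1,1) + y(1,1,0) + z(1,0,0) = (x + y + z,\ x + y,\ x) \]

Set corresponding components equal to each other to obtain the system of equations

\[ x + y + z = a, \qquad x + y = b, \qquad x = c \]

Solve the system for $x, y, z$ in terms of $a, b, c$ to find $x = c$, $y = b - c$, $z = a - b$. Thus,

\[ (a,b,c) = cw_1 + (b - c)w_2 + (a - b)w_3 \quad\text{or equivalently,}\quad [(a,b,c)] = [c,\ b - c,\ a - b]^T \]

(a) Because $G(x,y,z) = (2y + z,\ x - 4y,\ 3x)$,

\[
\begin{aligned}
G(w_1) &= G(1,1,1) = (3,-3,3) = 3w_1 - 6w_2 + 6w_3 \\
G(w_2) &= G(1,1,0) = (2,-3,3) = 3w_1 - 6w_2 + 5w_3 \\
G(w_3) &= G(1,0,0) = (0,1,3) = 3w_1 - 2w_2 - w_3
\end{aligned}
\]

Write the coordinates of $G(w_1)$, $G(w_2)$, $G(w_3)$ as columns to get

\[ [G] = \begin{bmatrix} 3 & 3 & 3 \\ -6 & -6 & -2 \\ 6 & 5 & -1 \end{bmatrix} \]

(b) Write $G(v)$ as a linear combination of $w_1, w_2, w_3$, where $v = (a,b,c)$ is an arbitrary vector in $\mathbf{R}^3$:

\[ G(v) = G(a,b,c) = (2b + c,\ a - 4b,\ 3a) = 3aw_1 + (-2a - 4b)w_2 + (-a + 6b + c)w_3 \]

or equivalently,

\[ [G(v)] = [3a,\ -2a - 4b,\ -a + 6b + c]^T \]

Accordingly,

\[ [G][v] = \begin{bmatrix} 3 & 3 & 3 \\ -6 & -6 & -2 \\ 6 & 5 & -1 \end{bmatrix} \begin{bmatrix} c \\ b - c \\ a - b \end{bmatrix} = \begin{bmatrix} 3a \\ -2a - 4b \\ -a + 6b + c \end{bmatrix} = [G(v)] \]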


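6.6. Consider the following $3 \times 3$ matrix $A$ and basis $S$ of $\mathbf{R}^3$:

\[ A = \begin{bmatrix} 1 & -2 & 1 \\ 3 & -1 & 0 \\ 1 & 4 & -2 \end{bmatrix} \quad\text{and}\quad S = \{u_1, u_2, u_3\} = \left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \right\} \]

The matrix $A$ defines a linear operator on $\mathbf{R}^3$. Find the matrix $B$ that represents the mapping $A$
relative to the basis $S$. (Recall that $A$ represents itself relative to the usual basis of $\mathbf{R}^3$.)
First find the coordinates of an arbitrary vector $(a,b,c)$ in $\mathbf{R}^3$ with respect to the basis $S$. We have

\[ \begin{bmatrix} a \\ b \\ c \end{bmatrix} = x\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} + z\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + z &= a \\ x + y + 2z &= b \\ x + y + 3z &= c \end{aligned} \]

Solve for $x, y, z$ in terms of $a, b, c$ to get

\[ x = a + b - c, \qquad y = -a + 2b - c, \qquad z = c - b \]

thus,

\[ (a,b,c)^T = (a + b - c)u_1 + (-a + 2b - c)u_2 + (c - b)u_3 \]

Then use the formula for $(a,b,c)^T$ to find the coordinates of $Au_1$, $Au_2$, $Au_3$ relative to the basis $S$:

\[
\begin{aligned}
A(u_1) &= A(1,1,1)^T = (0,2,3)^T = -u_1 + u_2 + u_3 \\
A(u_2) &= A(0,1,1)^T = (-1,-1,2)^T = -4u_1 - 3u_2 + 3u_3 \\
A(u_3) &= A(1,2,3)^T = (0,1,3)^T = -2u_1 - u_2 + 2u_3
\end{aligned}
\qquad\text{so}\qquad
B = \begin{bmatrix} -1 & -4 & -2 \\ 1 & -3 & -1 \\ 1 & 3 & 2 \end{bmatrix}
\]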


6.7. For each of the following linear transformations (operators) $L$ on $\mathbf{R}^2$, find the matrix $A$ that
represents $L$ (relative to the usual basis of $\mathbf{R}^2$):

(a) $L$ is defined by $L(1,0) = (2,4)$ and $L(0,1) = (5,8)$.
(b) $L$ is the rotation in $\mathbf{R}^2$ counterclockwise by $90°$.
(c) $L$ is the reflection in $\mathbf{R}^2$ about the line $y = -x$.


(a) Because $\{(1,0),\ (0,1)\}$ is the usual basis of $\mathbf{R}^2$, write their images under $L$ as columns to get

\[ A = \begin{bmatrix} 2 & 5 \\ 4 & 8 \end{bmatrix} \]

(b) Under the rotation $L$, we have $L(1,0) = (0,1)$ and $L(0,1) = (-1,0)$. Thus,

\[ A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \]

(c) Under the reflection $L$, we have $L(1,0) = (0,-1)$ and $L(0,1) = (-1,0)$. Thus,

\[ A = \begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix} \]


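6.8. The set $S = \{e^{3t},\ te^{3t},\ t^2e^{3t}\}$ is a basis of a vector space $V$ of functions $f: \mathbf{R} \to \mathbf{R}$. Let $D$ be the
differential operator on $V$; that is, $D(f) = df/dt$. Find the matrix representation of $D$ relative to
the basis $S$.
Find the image of each basis function:

\[
\begin{aligned}
D(e^{3t}) &= 3e^{3t} &&= 3(e^{3t}) + 0(te^{3t}) + 0(t^2e^{3t}) \\
D(te^{3t}) &= e^{3t} + 3te^{3t} &&= 1(e^{3t}) + 3(te^{3t}) + 0(t^2e^{3t}) \\
D(t^2e^{3t}) &= 2te^{3t} + 3t^2e^{3t} &&= 0(e^{3t}) + 2(te^{3t}) + 3(t^2e^{3t})
\end{aligned}
\qquad\text{and thus,}\qquad
[D] = \begin{bmatrix} 3 & 1 & 0 \\ 0 & 3 & 2 \\ 0 & 0 & 3 \end{bmatrix}
\]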
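To see the matrix $[D]$ of Problem 6.8 at work, one can differentiate a function in $V$ by multiplying its coordinate vector by $[D]$. The Python sketch below is illustrative only: the test function with coordinates $[2, 1, -1]^T$, the sample point $t = 0.5$, and the helper names `apply` and `evaluate` are all assumptions chosen here. It compares the result with a central-difference approximation of the derivative.

```python
import math

D = [[3, 1, 0],
     [0, 3, 2],
     [0, 0, 3]]    # [D] relative to S = {e^{3t}, t e^{3t}, t^2 e^{3t}}

def apply(M, x):
    """Matrix times coordinate vector."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def evaluate(c, t):
    """Value at t of c[0]*e^{3t} + c[1]*t*e^{3t} + c[2]*t^2*e^{3t}."""
    return (c[0] + c[1] * t + c[2] * t * t) * math.exp(3 * t)

f = [2, 1, -1]       # coordinates of f(t) = 2e^{3t} + t e^{3t} - t^2 e^{3t}
df = apply(D, f)     # coordinates of f'(t) relative to the same basis

# Compare with a central-difference estimate of f'(t) at a sample point.
t, h = 0.5, 1e-6
numeric = (evaluate(f, t + h) - evaluate(f, t - h)) / (2 * h)
print(df, abs(numeric - evaluate(df, t)) < 1e-6)
```

The finite-difference comparison is only a sanity check with a loose tolerance; the exact statement is the coordinate identity $[D][f] = [D(f)]$ of Theorem 6.1.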


6.9. Prove Theorem 6.1: Let $T: V \to V$ be a linear operator, and let $S$ be a (finite) basis of $V$. Then, for
any vector $v$ in $V$, $[T]_S[v]_S = [T(v)]_S$.
Suppose $S = \{u_1, u_2, \ldots, u_n\}$, and suppose, for $i = 1, \ldots, n$,

\[ T(u_i) = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n = \sum_{j=1}^{n} a_{ij}u_j \]

Then $[T]_S$ is the $n$-square matrix whose $j$th row is

\[ (a_{1j},\ a_{2j},\ \ldots,\ a_{nj}) \tag{1} \]

Now suppose

\[ v = k_1u_1 + k_2u_2 + \cdots + k_nu_n = \sum_{i=1}^{n} k_iu_i \]

Writing a column vector as the transpose of a row vector, we have

\[ [v]_S = [k_1, k_2, \ldots, k_n]^T \tag{2} \]

Furthermore, using the linearity of $T$,

\[
\begin{aligned}
T(v) &= T\Big(\sum_{i=1}^{n} k_iu_i\Big) = \sum_{i=1}^{n} k_iT(u_i) = \sum_{i=1}^{n} k_i\Big(\sum_{j=1}^{n} a_{ij}u_j\Big) \\
     &= \sum_{j=1}^{n} \Big(\sum_{i=1}^{n} a_{ij}k_i\Big)u_j = \sum_{j=1}^{n} (a_{1j}k_1 + a_{2j}k_2 + \cdots + a_{nj}k_n)u_j
\end{aligned}
\]

Thus, $[T(v)]_S$ is the column vector whose $j$th entry is

\[ a_{1j}k_1 + a_{2j}k_2 + \cdots + a_{nj}k_n \tag{3} \]

On the other hand, the $j$th entry of $[T]_S[v]_S$ is obtained by multiplying the $j$th row of $[T]_S$ by $[v]_S$; that is,
(1) by (2). But the product of (1) and (2) is (3). Hence, $[T]_S[v]_S$ and $[T(v)]_S$ have the same entries. Thus,

\[ [T]_S[v]_S = [T(v)]_S \]


6.10. Prove Theorem 6.2: Let $S = \{u_1, u_2, \ldots, u_n\}$ be a basis for $V$ over $K$, and let $M$ be the algebra of
$n$-square matrices over $K$. Then the mapping $m: A(V) \to M$ defined by $m(T) = [T]_S$ is a vector
space isomorphism. That is, for any $F, G \in A(V)$ and any $k \in K$, we have

(i) $[F+G] = [F] + [G]$, (ii) $[kF] = k[F]$, (iii) $m$ is one-to-one and onto.

(i) Suppose, for $i = 1, \ldots, n$,

\[ F(u_i) = \sum_{j=1}^{n} a_{ij}u_j \quad\text{and}\quad G(u_i) = \sum_{j=1}^{n} b_{ij}u_j \]

Consider the matrices $A = [a_{ij}]$ and $B = [b_{ij}]$. Then $[F] = A^T$ and $[G] = B^T$. We have, for $i = 1, \ldots, n$,

\[ (F + G)(u_i) = F(u_i) + G(u_i) = \sum_{j=1}^{n} (a_{ij} + b_{ij})u_j \]

Because $A + B$ is the matrix $(a_{ij} + b_{ij})$, we have

\[ [F + G] = (A + B)^T = A^T + B^T = [F] + [G] \]

(ii) Also, for $i = 1, \ldots, n$,

\[ (kF)(u_i) = kF(u_i) = k\sum_{j=1}^{n} a_{ij}u_j = \sum_{j=1}^{n} (ka_{ij})u_j \]

Because $kA$ is the matrix $(ka_{ij})$, we have

\[ [kF] = (kA)^T = kA^T = k[F] \]

(iii) Finally, $m$ is one-to-one, because a linear mapping is completely determined by its values on a basis.
Also, $m$ is onto, because the matrix $A = [a_{ij}]$ in $M$ is the image of the linear operator

\[ F(u_i) = \sum_{j=1}^{n} a_{ji}u_j, \qquad i = 1, \ldots, n \]

Thus, the theorem is proved.


6.11. Prove Theorem 6.3: For any linear operators $G, F \in A(V)$, $[G \circ F] = [G][F]$.
Using the notation in Problem 6.10, we have

\[
\begin{aligned}
(G \circ F)(u_i) &= G(F(u_i)) = G\Big(\sum_{j=1}^{n} a_{ij}u_j\Big) = \sum_{j=1}^{n} a_{ij}G(u_j) \\
                 &= \sum_{j=1}^{n} a_{ij}\Big(\sum_{k=1}^{n} b_{jk}u_k\Big) = \sum_{k=1}^{n} \Big(\sum_{j=1}^{n} a_{ij}b_{jk}\Big)u_k
\end{aligned}
\]

Recall that $AB$ is the matrix $AB = [c_{ik}]$, where $c_{ik} = \sum_{j=1}^{n} a_{ij}b_{jk}$. Accordingly,

\[ [G \circ F] = (AB)^T = B^TA^T = [G][F] \]

The theorem is proved.

6.12. Let $A$ be the matrix representation of a linear operator $T$. Prove that, for any polynomial $f(t)$, we
have that $f(A)$ is the matrix representation of $f(T)$. [Thus, $f(T) = 0$ if and only if $f(A) = 0$.]

Let $\phi$ be the mapping that sends an operator $T$ into its matrix representation $A$. We need to prove that
$\phi(f(T)) = f(A)$. Suppose $f(t) = a_nt^n + \cdots + a_1t + a_0$. The proof is by induction on $n$, the degree of $f(t)$.

Suppose $n = 0$. Recall that $\phi(I') = I$, where $I'$ is the identity mapping and $I$ is the identity matrix. Thus,

\[ \phi(f(T)) = \phi(a_0I') = a_0\phi(I') = a_0I = f(A) \]

and so the theorem holds for $n = 0$.
Now assume the theorem holds for polynomials of degree less than $n$. Then, because $\phi$ is an algebra
isomorphism,

\[
\begin{aligned}
\phi(f(T)) &= \phi(a_nT^n + a_{n-1}T^{n-1} + \cdots + a_1T + a_0I') \\
           &= a_n\phi(T)\phi(T^{n-1}) + \phi(a_{n-1}T^{n-1} + \cdots + a_1T + a_0I') \\
           &= a_nAA^{n-1} + (a_{n-1}A^{n-1} + \cdots + a_1A + a_0I) = f(A)
\end{aligned}
\]

and the theorem is proved.


Change of Basis 
The coordinate vector $[v]_S$ in this section will always denote a column vector; that is,

\[ [v]_S = [a_1, a_2, \ldots, a_n]^T \]

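6.13. Consider the following bases of $\mathbf{R}^2$:

\[ E = \{e_1, e_2\} = \{(1,0),\ (0,1)\} \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,3),\ (1,4)\} \]

(a) Find the change-of-basis matrix $P$ from the usual basis $E$ to $S$.
(b) Find the change-of-basis matrix $Q$ from $S$ back to $E$.
(c) Find the coordinate vector $[v]$ of $v = (5,-3)$ relative to $S$.


(a) Because $E$ is the usual basis, simply write the basis vectors in $S$ as columns: $P = \begin{bmatrix} 1 & 1 \\ 3 & 4 \end{bmatrix}$.

(b) Method 1. Use the definition of the change-of-basis matrix. That is, express each vector in $E$ as a
linear combination of the vectors in $S$. We do this by first finding the coordinates of an arbitrary vector
$v = (a,b)$ relative to $S$. We have

\[ (a,b) = x(1,3) + y(1,4) = (x + y,\ 3x + 4y) \quad\text{or}\quad \begin{aligned} x + y &= a \\ 3x + 4y &= b \end{aligned} \]

Solve for $x$ and $y$ to obtain $x = 4a - b$, $y = -3a + b$. Thus,

\[ v = (4a - b)u_1 + (-3a + b)u_2 \quad\text{and}\quad [v]_S = [(a,b)]_S = [4a - b,\ -3a + b]^T \]

Using the above formula for $[v]_S$ and writing the coordinates of the $e_i$ as columns yields

\[
\begin{aligned}
e_1 &= (1,0) = 4u_1 - 3u_2 \\
e_2 &= (0,1) = -u_1 + u_2
\end{aligned}
\qquad\text{and}\qquad
Q = \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix}
\]

Method 2. Because $Q = P^{-1}$, find $P^{-1}$, say by using the formula for the inverse of a $2 \times 2$ matrix.
Thus,

\[ P^{-1} = \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix} \]

(c) Method 1. Write $v$ as a linear combination of the vectors in $S$, say by using the above formula for
$v = (a,b)$. We have $v = (5,-3) = 23u_1 - 18u_2$, and so $[v]_S = [23, -18]^T$.
Method 2. Use, from Theorem 6.6, the fact that $[v]_S = P^{-1}[v]_E$, and the fact that $[v]_E = [5, -3]^T$:

\[ [v]_S = P^{-1}[v]_E = \begin{bmatrix} 4 & -1 \\ -3 & 1 \end{bmatrix} \begin{bmatrix} 5 \\ -3 \end{bmatrix} = \begin{bmatrix} 23 \\ -18 \end{bmatrix} \]

6.14. The vectors $u_1 = (1,2,0)$, $u_2 = (1,3,2)$, $u_3 = (0,1,3)$ form a basis $S$ of $\mathbf{R}^3$. Find
(a) The change-of-basis matrix $P$ from the usual basis $E = \{e_1, e_2, e_3\}$ to $S$.
(b) The change-of-basis matrix $Q$ from $S$ back to $E$.

(a) Because $E$ is the usual basis, simply write the basis vectors of $S$ as columns: $P = \begin{bmatrix} 1 & 1 & 0 \\ 2 & 3 & 1 \\ 0 & 2 & 3 \end{bmatrix}$.

(b) Method 1. Express each basis vector of $E$ as a linear combination of the basis vectors of $S$ by first
finding the coordinates of an arbitrary vector $v = (a,b,c)$ relative to the basis $S$. We have

\[ \begin{bmatrix} a \\ b \\ c \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix} + y\begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix} + z\begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + y &= a \\ 2x + 3y + z &= b \\ 2y + 3z &= c \end{aligned} \]

Solve for $x, y, z$ to get $x = 7a - 3b + c$, $y = -6a + 3b - c$, $z = 4a - 2b + c$. Thus,

\[ v = (a,b,c) = (7a - 3b + c)u_1 + (-6a + 3b - c)u_2 + (4a - 2b + c)u_3 \]

or $[v]_S = [(a,b,c)]_S = [7a - 3b + c,\ -6a + 3b - c,\ 4a - 2b + c]^T$.

Using the above formula for $[v]_S$ and then writing the coordinates of the $e_i$ as columns yields

\[
\begin{aligned}
e_1 &= (1,0,0) = 7u_1 - 6u_2 + 4u_3 \\
e_2 &= (0,1,0) = -3u_1 + 3u_2 - 2u_3 \\
e_3 &= (0,0,1) = u_1 - u_2 + u_3
\end{aligned}
\qquad\text{and}\qquad
Q = \begin{bmatrix} 7 & -3 & 1 \\ -6 & 3 & -1 \\ 4 & -2 & 1 \end{bmatrix}
\]

Method 2. Find $P^{-1}$ by row reducing $M = [P, I]$ to the form $[I, P^{-1}]$:

\[
\begin{aligned}
M &= \left[\begin{array}{ccc|ccc} 1 & 1 & 0 & 1 & 0 & 0 \\ 2 & 3 & 1 & 0 & 1 & 0 \\ 0 & 2 & 3 & 0 & 0 & 1 \end{array}\right]
\sim \left[\begin{array}{ccc|ccc} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & -2 & 1 & 0 \\ 0 & 2 & 3 & 0 & 0 & 1 \end{array}\right] \\
&\sim \left[\begin{array}{ccc|ccc} 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & -2 & 1 & 0 \\ 0 & 0 & 1 & 4 & -2 & 1 \end{array}\right]
\sim \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 7 & -3 & 1 \\ 0 & 1 & 0 & -6 & 3 & -1 \\ 0 & 0 & 1 & 4 & -2 & 1 \end{array}\right] = [I, P^{-1}]
\end{aligned}
\]

Thus, $Q = P^{-1} = \begin{bmatrix} 7 & -3 & 1 \\ -6 & 3 & -1 \\ 4 & -2 & 1 \end{bmatrix}$.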


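6.15. Suppose the $x$-axis and $y$-axis in the plane $\mathbf{R}^2$ are rotated counterclockwise $45°$ so that the new
$x'$-axis and $y'$-axis are along the line $y = x$ and the line $y = -x$, respectively.

(a) Find the change-of-basis matrix $P$.

(b) Find the coordinates of the point $A(5,6)$ under the given rotation.

(a) The unit vectors in the direction of the new $x'$- and $y'$-axes are

\[ u_1 = \left(\tfrac{1}{2}\sqrt{2},\ \tfrac{1}{2}\sqrt{2}\right) \quad\text{and}\quad u_2 = \left(-\tfrac{1}{2}\sqrt{2},\ \tfrac{1}{2}\sqrt{2}\right) \]

(The unit vectors in the direction of the original $x$ and $y$ axes are the usual basis of $\mathbf{R}^2$.) Thus, write the
coordinates of $u_1$ and $u_2$ as columns to obtain

\[ P = \begin{bmatrix} \tfrac{1}{2}\sqrt{2} & -\tfrac{1}{2}\sqrt{2} \\ \tfrac{1}{2}\sqrt{2} & \tfrac{1}{2}\sqrt{2} \end{bmatrix} \]

(b) Multiply the coordinates of the point by $P^{-1}$:

\[ \begin{bmatrix} \tfrac{1}{2}\sqrt{2} & \tfrac{1}{2}\sqrt{2} \\ -\tfrac{1}{2}\sqrt{2} & \tfrac{1}{2}\sqrt{2} \end{bmatrix} \begin{bmatrix} 5 \\ 6 \end{bmatrix} = \begin{bmatrix} \tfrac{11}{2}\sqrt{2} \\ \tfrac{1}{2}\sqrt{2} \end{bmatrix} \]

(Because $P$ is orthogonal, $P^{-1}$ is simply the transpose of $P$.)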
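The computation in Problem 6.15 generalizes to any rotation angle. The Python sketch below is an illustration under stated assumptions: the function name `rotated_coords` is introduced here, angles are given in degrees, and floating-point arithmetic is used, so results agree with the exact values only up to rounding.

```python
import math

def rotated_coords(point, degrees):
    """Coordinates of a point relative to axes rotated counterclockwise
    by the given angle."""
    th = math.radians(degrees)
    c, s = math.cos(th), math.sin(th)
    # Columns of P are the unit vectors along the new x'- and y'-axes.
    P = [[c, -s],
         [s,  c]]
    x, y = point
    # P is orthogonal, so P^-1 is its transpose; apply the transpose to (x, y).
    return (P[0][0] * x + P[1][0] * y, P[0][1] * x + P[1][1] * y)

new_x, new_y = rotated_coords((5, 6), 45)
print(new_x, new_y)
```

For the point $A(5,6)$ and a $45°$ rotation, the two printed values should be close to $\tfrac{11}{2}\sqrt{2}$ and $\tfrac{1}{2}\sqrt{2}$.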
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6.16. The vectors u; = (1,1,0), u = (0, 1, 1), u, = (1,2,2) form a basis S of R°. Find the coordinates 


6.17. 


of an arbitrary vector v = (a,b,c) relative to the basis S. 


Method 1. Express v as a linear combination of u, u2, uz using unknowns x,y,z. We have 
(a,b,c) =x(1,1,0) + y(0, 1,1) +2(1,2,2) = (x +z, x+y + 2z, y+ 2z) 
this yields the system 


x+ Z=a x+ Z=a x+ z=a 
x+y+2z=b or y+ z=—a+b or y+z=-—a+b 
y+2z=c y+2z=c z=a-—b+c 


Solving by back-substitution yields x =b—c, y= —2a+2b-—c, z=a—b+c. Thus, 
[uly = [b — c, —2a+2b-—c, a—b4 c)” 


Method 2. Find P~! by row reducing M = [P,/| to the form [7,P~'], where P is the change-of-basis 
matrix from the usual basis E to S or, in other words, the matrix whose columns are the basis vectors of S. 


We have 
10 11100 0 1' 10 0 
M=|1 1 2;0 1 0f» 1 1;-1 1 0 
01 2'00 1 012} 001 
101 1 00 1 0 0! 0 1 -l 
alori i olele ip 21 = [Z,P7'] 
0 0 1! 1 —1 1 OO 1! 1 -1 1 
0 1 <1 0 1 =-1]fa b-c 
Thus, P'=|-2 2 —1| and [i]s = Pli] = |-2 2 -1]|b| = |—2a+2b-c 

1 —1 1 1 -l1 1 c a—b+c 


Consider the following bases of R?: 
S= {uy u} = {(1,=2); (3,—4)} and S = {v1 v2} = {(1,3), (3,8)} 


(a) Find the coordinates of v = (a, b) relative to the basis S. 

(b) Find the change-of-basis matrix P from S to S’. 

(c) Find the coordinates of v = (a,b) relative to the basis S’. 

(d) Find the change-of-basis matrix Q from S’ back to S. 

(e) Verify Q = Po. 

(f) Show that, for any vector v = (a,b) in R?, P![u], = [u]y. (See Theorem 6.6.) 


(a) Let v = xu, + yu, for unknowns x and y; that is, 


a} _ 1 3 x+3y=a x+3y=a 
H nE tfal or -2x — 4y =b or eath 
Solve for x and y in terms of a and b to get x= —2a —żb and y=at+hb. Thus, 


(a,b) = (—2a — Ju; + (a+ 55)uy or [(a, b)|5 = [—2a — 3b, a+ 1b)” 


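(b) Use part (a) to write each of the basis vectors $v_1$ and $v_2$ of $S'$ as a linear combination of the basis vectors $u_1$ and $u_2$ of $S$; that is,
\[ v_1 = (1,3) = (-2 - \tfrac{9}{2})u_1 + (1 + \tfrac{3}{2})u_2 = -\tfrac{13}{2}u_1 + \tfrac{5}{2}u_2 \]
\[ v_2 = (3,8) = (-6 - 12)u_1 + (3 + 4)u_2 = -18u_1 + 7u_2 \]
Then $P$ is the matrix whose columns are the coordinates of $v_1$ and $v_2$ relative to the basis $S$; that is,
\[ P = \begin{bmatrix} -\tfrac{13}{2} & -18 \\ \tfrac{5}{2} & 7 \end{bmatrix} \]

(c) Let $v = xv_1 + yv_2$ for unknown scalars $x$ and $y$:
\[
\begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ 3 \end{bmatrix} + y\begin{bmatrix} 3 \\ 8 \end{bmatrix}
\quad\text{or}\quad
\begin{array}{r} x + 3y = a \\ 3x + 8y = b \end{array}
\quad\text{or}\quad
\begin{array}{r} x + 3y = a \\ -y = b - 3a \end{array}
\]
Solve for $x$ and $y$ to get $x = -8a + 3b$ and $y = 3a - b$. Thus,
\[ (a,b) = (-8a + 3b)v_1 + (3a - b)v_2 \quad\text{or}\quad [(a,b)]_{S'} = [-8a + 3b,\ 3a - b]^T \]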


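(d) Use part (c) to express each of the basis vectors $u_1$ and $u_2$ of $S$ as a linear combination of the basis vectors $v_1$ and $v_2$ of $S'$:
\[ u_1 = (1,-2) = (-8 - 6)v_1 + (3 + 2)v_2 = -14v_1 + 5v_2 \]
\[ u_2 = (3,-4) = (-24 - 12)v_1 + (9 + 4)v_2 = -36v_1 + 13v_2 \]
Write the coordinates of $u_1$ and $u_2$ relative to $S'$ as columns to obtain
\[ Q = \begin{bmatrix} -14 & -36 \\ 5 & 13 \end{bmatrix} \]

(e) \[ QP = \begin{bmatrix} -14 & -36 \\ 5 & 13 \end{bmatrix}\begin{bmatrix} -\tfrac{13}{2} & -18 \\ \tfrac{5}{2} & 7 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I \]

(f) Use parts (a), (c), and (d) to obtain
\[ P^{-1}[v]_S = Q[v]_S = \begin{bmatrix} -14 & -36 \\ 5 & 13 \end{bmatrix}\begin{bmatrix} -2a - \tfrac{3}{2}b \\ a + \tfrac{1}{2}b \end{bmatrix} = \begin{bmatrix} -8a + 3b \\ 3a - b \end{bmatrix} = [v]_{S'} \]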
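Parts (e) and (f) are short matrix products, so they are easy to confirm by machine. The sketch below is an illustration only; the helper `matmul` and the sample vector $v = (3,-4)$ are assumptions introduced here, and fractions are kept exact.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Change-of-basis matrices found in parts (b) and (d).
P = [[Fr(-13, 2), Fr(-18)], [Fr(5, 2), Fr(7)]]
Q = [[Fr(-14), Fr(-36)], [Fr(5), Fr(13)]]

# Part (e): QP should be the identity, so Q = P^{-1}.
QP = matmul(Q, P)

# Part (f) for the sample vector v = (a, b) = (3, -4), which equals u2.
a, b = Fr(3), Fr(-4)
v_S = [[-2 * a - Fr(3, 2) * b], [a + Fr(1, 2) * b]]   # formula from part (a)
v_S_prime = matmul(Q, v_S)                            # Q [v]_S
expected = [[-8 * a + 3 * b], [3 * a - b]]            # formula from part (c)
```

Since $v = u_2$ here, its $S$-coordinates are $(0,1)$, and the product picks out the second column of $Q$.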
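6.18. Suppose $P$ is the change-of-basis matrix from a basis $\{u_i\}$ to a basis $\{w_i\}$, and suppose $Q$ is the change-of-basis matrix from the basis $\{w_i\}$ back to $\{u_i\}$. Prove that $P$ is invertible and that $Q = P^{-1}$.

Suppose, for $i = 1,2,\ldots,n$, that
\[ w_i = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n = \sum_{j=1}^{n} a_{ij}u_j \tag{1} \]
and, for $j = 1,2,\ldots,n$,
\[ u_j = b_{j1}w_1 + b_{j2}w_2 + \cdots + b_{jn}w_n = \sum_{k=1}^{n} b_{jk}w_k \tag{2} \]
Let $A = [a_{ij}]$ and $B = [b_{jk}]$. Then $P = A^T$ and $Q = B^T$. Substituting (2) into (1) yields
\[ w_i = \sum_{j=1}^{n} a_{ij}\Big(\sum_{k=1}^{n} b_{jk}w_k\Big) = \sum_{k=1}^{n}\Big(\sum_{j=1}^{n} a_{ij}b_{jk}\Big)w_k \]
Because $\{w_i\}$ is a basis, $\sum_j a_{ij}b_{jk} = \delta_{ik}$, where $\delta_{ik}$ is the Kronecker delta; that is, $\delta_{ik} = 1$ if $i = k$ but $\delta_{ik} = 0$ if $i \neq k$. Suppose $AB = [c_{ik}]$. Then $c_{ik} = \delta_{ik}$. Accordingly, $AB = I$, and so
\[ QP = B^TA^T = (AB)^T = I^T = I \]
Thus, $Q = P^{-1}$.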
6.19. Consider a finite sequence of vectors $S = \{u_1, u_2, \ldots, u_n\}$. Let $S'$ be the sequence of vectors obtained from $S$ by one of the following "elementary operations":
(1) Interchange two vectors.
(2) Multiply a vector by a nonzero scalar.
(3) Add a multiple of one vector to another vector.

Show that $S$ and $S'$ span the same subspace $W$. Also, show that $S'$ is linearly independent if and only if $S$ is linearly independent.

Observe that, for each operation, the vectors in $S'$ are linear combinations of vectors in $S$. Also, because each operation has an inverse of the same type, each vector in $S$ is a linear combination of vectors in $S'$. Thus, $S$ and $S'$ span the same subspace $W$. Moreover, $S'$ is linearly independent if and only if $\dim W = n$, and this is true if and only if $S$ is linearly independent.


6.20. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be row equivalent $m \times n$ matrices over a field $K$, and let $v_1, v_2, \ldots, v_n$ be any vectors in a vector space $V$ over $K$. For $i = 1,2,\ldots,m$, let $u_i$ and $w_i$ be defined by
\[ u_i = a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n \quad\text{and}\quad w_i = b_{i1}v_1 + b_{i2}v_2 + \cdots + b_{in}v_n \]
Show that $\{u_i\}$ and $\{w_i\}$ span the same subspace of $V$.

Applying an "elementary operation" of Problem 6.19 to $\{u_i\}$ is equivalent to applying an elementary row operation to the matrix $A$. Because $A$ and $B$ are row equivalent, $B$ can be obtained from $A$ by a sequence of elementary row operations. Hence, $\{w_i\}$ can be obtained from $\{u_i\}$ by the corresponding sequence of operations. Accordingly, $\{u_i\}$ and $\{w_i\}$ span the same space.

6.21. Suppose $u_1, u_2, \ldots, u_n$ belong to a vector space $V$ over a field $K$, and suppose $P = [a_{ij}]$ is an $n$-square matrix over $K$. For $i = 1,2,\ldots,n$, let $v_i = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n$.

(a) Suppose $P$ is invertible. Show that $\{u_i\}$ and $\{v_i\}$ span the same subspace of $V$. Hence, $\{u_i\}$ is linearly independent if and only if $\{v_i\}$ is linearly independent.

(b) Suppose $P$ is singular (not invertible). Show that $\{v_i\}$ is linearly dependent.

(c) Suppose $\{v_i\}$ is linearly independent. Show that $P$ is invertible.

(a) Because $P$ is invertible, it is row equivalent to the identity matrix $I$. Hence, by Problem 6.19, $\{v_i\}$ and $\{u_i\}$ span the same subspace of $V$. Thus, one is linearly independent if and only if the other is linearly independent.

(b) Because $P$ is not invertible, it is row equivalent to a matrix with a zero row. This means $\{v_i\}$ spans a subspace that has a spanning set with fewer than $n$ elements. Thus, $\{v_i\}$ is linearly dependent.

(c) This is the contrapositive of the statement of part (b), and so it follows from part (b).


6.22. Prove Theorem 6.6: Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then, for any vector $v \in V$, we have $P[v]_{S'} = [v]_S$, and hence, $P^{-1}[v]_S = [v]_{S'}$.

Suppose $S = \{u_1,\ldots,u_n\}$ and $S' = \{w_1,\ldots,w_n\}$, and suppose, for $i = 1,\ldots,n$,
\[ w_i = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{in}u_n = \sum_{j=1}^{n} a_{ij}u_j \]
Then $P$ is the $n$-square matrix whose $j$th row is
\[ (a_{1j}, a_{2j}, \ldots, a_{nj}) \tag{1} \]
Also suppose $v = k_1w_1 + k_2w_2 + \cdots + k_nw_n = \sum_{i=1}^{n} k_iw_i$. Then
\[ [v]_{S'} = [k_1, k_2, \ldots, k_n]^T \tag{2} \]
Substituting for $w_i$ in the equation for $v$, we obtain
\[ v = \sum_{i=1}^{n} k_iw_i = \sum_{i=1}^{n} k_i\Big(\sum_{j=1}^{n} a_{ij}u_j\Big) = \sum_{j=1}^{n}\Big(\sum_{i=1}^{n} a_{ij}k_i\Big)u_j = \sum_{j=1}^{n} (a_{1j}k_1 + a_{2j}k_2 + \cdots + a_{nj}k_n)u_j \]
Accordingly, $[v]_S$ is the column vector whose $j$th entry is
\[ a_{1j}k_1 + a_{2j}k_2 + \cdots + a_{nj}k_n \tag{3} \]
On the other hand, the $j$th entry of $P[v]_{S'}$ is obtained by multiplying the $j$th row of $P$ by $[v]_{S'}$, that is, (1) by (2). However, the product of (1) and (2) is (3). Hence, $P[v]_{S'}$ and $[v]_S$ have the same entries. Thus, $P[v]_{S'} = [v]_S$, as claimed.

Furthermore, multiplying the above by $P^{-1}$ gives $P^{-1}[v]_S = P^{-1}P[v]_{S'} = [v]_{S'}$.


Linear Operators and Change of Basis


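6.23. Consider the linear transformation $F$ on $\mathbf{R}^2$ defined by $F(x,y) = (5x - y,\ 2x + y)$ and the following bases of $\mathbf{R}^2$:
\[ E = \{e_1, e_2\} = \{(1,0),\ (0,1)\} \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,4),\ (2,7)\} \]
(a) Find the change-of-basis matrix $P$ from $E$ to $S$ and the change-of-basis matrix $Q$ from $S$ back to $E$.
(b) Find the matrix $A$ that represents $F$ in the basis $E$.
(c) Find the matrix $B$ that represents $F$ in the basis $S$.

(a) Because $E$ is the usual basis, simply write the vectors in $S$ as columns to obtain the change-of-basis matrix $P$. Recall, also, that $Q = P^{-1}$. Thus,
\[ P = \begin{bmatrix} 1 & 2 \\ 4 & 7 \end{bmatrix} \quad\text{and}\quad Q = P^{-1} = \begin{bmatrix} -7 & 2 \\ 4 & -1 \end{bmatrix} \]

(b) Write the coefficients of $x$ and $y$ in $F(x,y) = (5x - y,\ 2x + y)$ as rows to get
\[ A = \begin{bmatrix} 5 & -1 \\ 2 & 1 \end{bmatrix} \]

(c) Method 1. Find the coordinates of $F(u_1)$ and $F(u_2)$ relative to the basis $S$. This may be done by first finding the coordinates of an arbitrary vector $(a,b)$ in $\mathbf{R}^2$ relative to the basis $S$. We have
\[ (a,b) = x(1,4) + y(2,7) = (x + 2y,\ 4x + 7y), \quad\text{and so}\quad \begin{array}{r} x + 2y = a \\ 4x + 7y = b \end{array} \]
Solve for $x$ and $y$ in terms of $a$ and $b$ to get $x = -7a + 2b$, $y = 4a - b$. Then
\[ (a,b) = (-7a + 2b)u_1 + (4a - b)u_2 \]
Now use the formula for $(a,b)$ to obtain
\[ \begin{array}{l} F(u_1) = F(1,4) = (1,6) = 5u_1 - 2u_2 \\ F(u_2) = F(2,7) = (3,11) = u_1 + u_2 \end{array} \quad\text{and so}\quad B = \begin{bmatrix} 5 & 1 \\ -2 & 1 \end{bmatrix} \]
Method 2. By Theorem 6.7, $B = P^{-1}AP$. Thus,
\[ B = P^{-1}AP = \begin{bmatrix} -7 & 2 \\ 4 & -1 \end{bmatrix}\begin{bmatrix} 5 & -1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 4 & 7 \end{bmatrix} = \begin{bmatrix} 5 & 1 \\ -2 & 1 \end{bmatrix} \]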
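Method 2 amounts to two matrix products and one $2 \times 2$ inverse. A minimal sketch under stated assumptions follows: the helper names `matmul` and `inverse_2x2` are introduced here for illustration and are not part of the text.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[5, -1], [2, 1]]   # F relative to the usual basis E
P = [[1, 2], [4, 7]]    # columns are u1 = (1,4) and u2 = (2,7)

Q = inverse_2x2(P)              # change-of-basis matrix from S back to E
B = matmul(Q, matmul(A, P))     # B = P^{-1} A P, the matrix of F in the basis S
```

The columns of `B` are the $S$-coordinates of $F(u_1)$ and $F(u_2)$, which is exactly what Method 1 computes by hand.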


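6.24. Let $A = \begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to the basis $S = \{u_1, u_2\} = \{[1,3]^T,\ [2,5]^T\}$. [Recall $A$ defines a linear operator $A: \mathbf{R}^2 \to \mathbf{R}^2$ relative to the usual basis $E$ of $\mathbf{R}^2$.]

Method 1. Find the coordinates of $A(u_1)$ and $A(u_2)$ relative to the basis $S$ by first finding the coordinates of an arbitrary vector $[a, b]^T$ in $\mathbf{R}^2$ relative to the basis $S$. By Problem 6.2,
\[ [a, b]^T = (-5a + 2b)u_1 + (3a - b)u_2 \]
Using the formula for $[a, b]^T$, we obtain
\[ A(u_1) = \begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 11 \\ 1 \end{bmatrix} = -53u_1 + 32u_2 \]
and
\[ A(u_2) = \begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix}\begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} 19 \\ 3 \end{bmatrix} = -89u_1 + 54u_2 \]
Thus,
\[ B = \begin{bmatrix} -53 & -89 \\ 32 & 54 \end{bmatrix} \]

Method 2. Use $B = P^{-1}AP$, where $P$ is the change-of-basis matrix from the usual basis $E$ to $S$. Thus, simply write the vectors in $S$ (as columns) to obtain the change-of-basis matrix $P$ and then use the formula for $P^{-1}$. This gives
\[ P = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \quad\text{and}\quad P^{-1} = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix} \]
Then
\[ B = P^{-1}AP = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}\begin{bmatrix} 2 & 3 \\ 4 & -1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} = \begin{bmatrix} -53 & -89 \\ 32 & 54 \end{bmatrix} \]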
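6.25. Let $A = \begin{bmatrix} 1 & 3 & 1 \\ 2 & 5 & -4 \\ 1 & -2 & 2 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to the basis
\[ S = \{u_1, u_2, u_3\} = \{[1,1,0]^T,\ [0,1,1]^T,\ [1,2,2]^T\} \]
[Recall that $A$ defines a linear operator $A: \mathbf{R}^3 \to \mathbf{R}^3$ relative to the usual basis $E$ of $\mathbf{R}^3$.]

Method 1. Find the coordinates of $A(u_1)$, $A(u_2)$, $A(u_3)$ relative to the basis $S$ by first finding the coordinates of an arbitrary vector $v = (a,b,c)$ in $\mathbf{R}^3$ relative to the basis $S$. By Problem 6.16,
\[ v = (b - c)u_1 + (-2a + 2b - c)u_2 + (a - b + c)u_3 \]
Using this formula for $[a, b, c]^T$, we obtain
\[ A(u_1) = [4,7,-1]^T = 8u_1 + 7u_2 - 4u_3, \qquad A(u_2) = [4,1,0]^T = u_1 - 6u_2 + 3u_3 \]
\[ A(u_3) = [9,4,1]^T = 3u_1 - 11u_2 + 6u_3 \]
(For $A(u_1)$, the third coordinate is $a - b + c = 4 - 7 - 1 = -4$.) Writing the coefficients of $u_1, u_2, u_3$ as columns yields
\[ B = \begin{bmatrix} 8 & 1 & 3 \\ 7 & -6 & -11 \\ -4 & 3 & 6 \end{bmatrix} \]

Method 2. Use $B = P^{-1}AP$, where $P$ is the change-of-basis matrix from the usual basis $E$ to $S$. The matrix $P$ (whose columns are simply the vectors in $S$) and $P^{-1}$ appear in Problem 6.16. Thus,
\[ B = P^{-1}AP = \begin{bmatrix} 0 & 1 & -1 \\ -2 & 2 & -1 \\ 1 & -1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 3 & 1 \\ 2 & 5 & -4 \\ 1 & -2 & 2 \end{bmatrix}\begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 2 \\ 0 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 8 & 1 & 3 \\ 7 & -6 & -11 \\ -4 & 3 & 6 \end{bmatrix} \]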
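Because a $3 \times 3$ computation of this kind is easy to get wrong by hand, it helps to recompute the entries of $B$ directly. The following sketch follows Method 1; the function names `apply` and `coords_in_S` are illustrative assumptions, and the coordinate formula is the one quoted from Problem 6.16.

```python
def apply(A, v):
    """Multiply the matrix A (list of rows) by the column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def coords_in_S(v):
    """Coordinates of v = (a, b, c) relative to S, using the formula of Problem 6.16."""
    a, b, c = v
    return [b - c, -2 * a + 2 * b - c, a - b + c]

A = [[1, 3, 1], [2, 5, -4], [1, -2, 2]]
S = [[1, 1, 0], [0, 1, 1], [1, 2, 2]]   # the basis vectors u1, u2, u3

# Column j of B holds the S-coordinates of A(u_j).
columns = [coords_in_S(apply(A, u)) for u in S]
B = [list(row) for row in zip(*columns)]

# Sanity check: each A(u_j) is rebuilt from its coordinates.
for u, col in zip(S, columns):
    rebuilt = [sum(k * s[i] for k, s in zip(col, S)) for i in range(3)]
    assert rebuilt == apply(A, u)

# Similar matrices have equal traces, which gives a second quick check.
assert sum(B[i][i] for i in range(3)) == sum(A[i][i] for i in range(3))
```

The rebuild loop is the useful part of the design: it verifies each column of `B` against the definition of coordinates rather than against a hand-computed answer.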


6.26. Prove Theorem 6.7: Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then, for any linear operator $T$ on $V$, $[T]_{S'} = P^{-1}[T]_SP$.

Let $v$ be a vector in $V$. Then, by Theorem 6.6, $P[v]_{S'} = [v]_S$. Therefore,
\[ P^{-1}[T]_SP[v]_{S'} = P^{-1}[T]_S[v]_S = P^{-1}[T(v)]_S = [T(v)]_{S'} \]
But $[T]_{S'}[v]_{S'} = [T(v)]_{S'}$. Hence,
\[ P^{-1}[T]_SP[v]_{S'} = [T]_{S'}[v]_{S'} \]
Because the mapping $v \mapsto [v]_{S'}$ is onto $K^n$, we have $P^{-1}[T]_SPX = [T]_{S'}X$ for every $X \in K^n$. Thus, $P^{-1}[T]_SP = [T]_{S'}$, as claimed.


Similarity of Matrices 


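6.27. Let $A = \begin{bmatrix} 4 & -2 \\ 3 & 6 \end{bmatrix}$ and $P = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$.
(a) Find $B = P^{-1}AP$. (b) Verify $\operatorname{tr}(B) = \operatorname{tr}(A)$. (c) Verify $\det(B) = \det(A)$.

(a) First find $P^{-1}$ using the formula for the inverse of a $2 \times 2$ matrix. We have
\[ P^{-1} = \begin{bmatrix} -2 & 1 \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{bmatrix} \]
Then
\[ B = P^{-1}AP = \begin{bmatrix} -2 & 1 \\ \tfrac{3}{2} & -\tfrac{1}{2} \end{bmatrix}\begin{bmatrix} 4 & -2 \\ 3 & 6 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 25 & 30 \\ -\tfrac{27}{2} & -15 \end{bmatrix} \]

(b) $\operatorname{tr}(A) = 4 + 6 = 10$ and $\operatorname{tr}(B) = 25 - 15 = 10$. Hence, $\operatorname{tr}(B) = \operatorname{tr}(A)$.
(c) $\det(A) = 24 + 6 = 30$ and $\det(B) = -375 + 405 = 30$. Hence, $\det(B) = \det(A)$.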
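The invariance of trace and determinant under similarity can be confirmed for this pair with a few lines of exact arithmetic. This is a sketch only; `matmul`, `trace`, and `det2` are helper names assumed here.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[4, -2], [3, 6]]
P = [[1, 2], [3, 4]]
P_inv = [[Fraction(-2), Fraction(1)], [Fraction(3, 2), Fraction(-1, 2)]]

B = matmul(P_inv, matmul(A, P))   # B = P^{-1} A P

same_trace = trace(B) == trace(A)
same_det = det2(B) == det2(A)
```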


6.28. Find the trace of each of the linear transformations $F$ on $\mathbf{R}^3$ in Problem 6.4.

Find the trace (sum of the diagonal elements) of any matrix representation of $F$, such as the matrix representation $[F] = [F]_E$ of $F$ relative to the usual basis $E$ given in Problem 6.4.

(a) $\operatorname{tr}(F) = \operatorname{tr}([F]) = 1 - 5 + 9 = 5$.
(b) $\operatorname{tr}(F) = \operatorname{tr}([F]) = 1 + 3 + 5 = 9$.
(c) $\operatorname{tr}(F) = \operatorname{tr}([F]) = 1 + 4 + 7 = 12$.


6.29. Write $A \approx B$ if $A$ is similar to $B$, that is, if there exists an invertible matrix $P$ such that $A = P^{-1}BP$. Prove that $\approx$ is an equivalence relation (on square matrices); that is,

(a) $A \approx A$, for every $A$. (b) If $A \approx B$, then $B \approx A$.
(c) If $A \approx B$ and $B \approx C$, then $A \approx C$.

(a) The identity matrix $I$ is invertible, and $I^{-1} = I$. Because $A = I^{-1}AI$, we have $A \approx A$.

(b) Because $A \approx B$, there exists an invertible matrix $P$ such that $A = P^{-1}BP$. Hence, $B = PAP^{-1} = (P^{-1})^{-1}AP^{-1}$ and $P^{-1}$ is also invertible. Thus, $B \approx A$.

(c) Because $A \approx B$, there exists an invertible matrix $P$ such that $A = P^{-1}BP$, and as $B \approx C$, there exists an invertible matrix $Q$ such that $B = Q^{-1}CQ$. Thus,
\[ A = P^{-1}BP = P^{-1}(Q^{-1}CQ)P = (P^{-1}Q^{-1})C(QP) = (QP)^{-1}C(QP) \]
and $QP$ is also invertible. Thus, $A \approx C$.


6.30. Suppose $B$ is similar to $A$, say $B = P^{-1}AP$. Prove

(a) $B^n = P^{-1}A^nP$, and so $B^n$ is similar to $A^n$.

(b) $f(B) = P^{-1}f(A)P$, for any polynomial $f(x)$, and so $f(B)$ is similar to $f(A)$.
(c) $B$ is a root of a polynomial $g(x)$ if and only if $A$ is a root of $g(x)$.

(a) The proof is by induction on $n$. The result holds for $n = 1$ by hypothesis. Suppose $n > 1$ and the result holds for $n - 1$. Then
\[ B^n = BB^{n-1} = (P^{-1}AP)(P^{-1}A^{n-1}P) = P^{-1}A^nP \]

(b) Suppose $f(x) = a_nx^n + \cdots + a_1x + a_0$. Using the left and right distributive laws and part (a), we have
\[
\begin{aligned}
P^{-1}f(A)P &= P^{-1}(a_nA^n + \cdots + a_1A + a_0I)P \\
&= P^{-1}(a_nA^n)P + \cdots + P^{-1}(a_1A)P + P^{-1}(a_0I)P \\
&= a_n(P^{-1}A^nP) + \cdots + a_1(P^{-1}AP) + a_0(P^{-1}IP) \\
&= a_nB^n + \cdots + a_1B + a_0I = f(B)
\end{aligned}
\]

(c) By part (b), $g(B) = 0$ if and only if $P^{-1}g(A)P = 0$ if and only if $g(A) = P0P^{-1} = 0$.
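Part (b) can be illustrated numerically. The sketch below evaluates a polynomial at a matrix by Horner's rule and compares $f(B)$ with $P^{-1}f(A)P$. The matrices are borrowed from Problem 6.27, and the polynomial $f(x) = x^3 - 2x + 7$ and the helper names are assumptions chosen only for the example.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def poly_at(coeffs, M):
    """Evaluate f(M) by Horner's rule; coeffs = [a_n, ..., a_1, a_0]."""
    n = len(M)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in coeffs:
        # result <- result * M + c * I
        result = matmul(result, M)
        result = [[result[i][j] + (c if i == j else 0) for j in range(n)]
                  for i in range(n)]
    return result

A = [[4, -2], [3, 6]]
P = [[1, 2], [3, 4]]
P_inv = [[Fraction(-2), Fraction(1)], [Fraction(3, 2), Fraction(-1, 2)]]
B = matmul(P_inv, matmul(A, P))           # B = P^{-1} A P

f = [1, 0, -2, 7]                         # f(x) = x^3 - 2x + 7 (sample polynomial)
lhs = poly_at(f, B)                       # f(B)
rhs = matmul(P_inv, matmul(poly_at(f, A), P))   # P^{-1} f(A) P
```

With exact fractions, `lhs` and `rhs` agree entry by entry, as part (b) predicts.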


Matrix Representations of General Linear Mappings 


6.31. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ be the linear map defined by $F(x,y,z) = (3x + 2y - 4z,\ x - 5y + 3z)$.
(a) Find the matrix of $F$ in the following bases of $\mathbf{R}^3$ and $\mathbf{R}^2$:
\[ S = \{w_1, w_2, w_3\} = \{(1,1,1),\ (1,1,0),\ (1,0,0)\} \quad\text{and}\quad S' = \{u_1, u_2\} = \{(1,3),\ (2,5)\} \]
(b) Verify Theorem 6.10: The action of $F$ is preserved by its matrix representation; that is, for any $v$ in $\mathbf{R}^3$, we have $[F]_{S,S'}[v]_S = [F(v)]_{S'}$.

(a) From Problem 6.2, $(a,b) = (-5a + 2b)u_1 + (3a - b)u_2$. Thus,
\[
\begin{aligned}
F(w_1) &= F(1,1,1) = (1,-1) = -7u_1 + 4u_2 \\
F(w_2) &= F(1,1,0) = (5,-4) = -33u_1 + 19u_2 \\
F(w_3) &= F(1,0,0) = (3,1) = -13u_1 + 8u_2
\end{aligned}
\]
Write the coordinates of $F(w_1)$, $F(w_2)$, $F(w_3)$ as columns to get
\[ [F]_{S,S'} = \begin{bmatrix} -7 & -33 & -13 \\ 4 & 19 & 8 \end{bmatrix} \]

(b) If $v = (x,y,z)$, then, by Problem 6.5, $v = zw_1 + (y - z)w_2 + (x - y)w_3$. Also,
\[ F(v) = (3x + 2y - 4z,\ x - 5y + 3z) = (-13x - 20y + 26z)u_1 + (8x + 11y - 15z)u_2 \]
Hence,
\[ [v]_S = (z,\ y - z,\ x - y)^T \quad\text{and}\quad [F(v)]_{S'} = \begin{bmatrix} -13x - 20y + 26z \\ 8x + 11y - 15z \end{bmatrix} \]
Thus,
\[ [F]_{S,S'}[v]_S = \begin{bmatrix} -7 & -33 & -13 \\ 4 & 19 & 8 \end{bmatrix}\begin{bmatrix} z \\ y - z \\ x - y \end{bmatrix} = \begin{bmatrix} -13x - 20y + 26z \\ 8x + 11y - 15z \end{bmatrix} = [F(v)]_{S'} \]
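The construction in part (a) and the check in part (b) can be sketched in code. The function names and the sample vector $v = (2,-1,3)$ below are illustrative assumptions; the two coordinate formulas are the ones quoted from Problems 6.2 and 6.5.

```python
def F(v):
    """The linear map F(x, y, z) = (3x + 2y - 4z, x - 5y + 3z)."""
    x, y, z = v
    return (3 * x + 2 * y - 4 * z, x - 5 * y + 3 * z)

def coords_S_prime(p):
    """Coordinates of (a, b) relative to S' = {(1,3), (2,5)} (Problem 6.2)."""
    a, b = p
    return (-5 * a + 2 * b, 3 * a - b)

def coords_S(v):
    """Coordinates of (x, y, z) relative to S = {(1,1,1), (1,1,0), (1,0,0)} (Problem 6.5)."""
    x, y, z = v
    return (z, y - z, x - y)

S = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]

# Column j of the matrix holds the S'-coordinates of F(w_j).
columns = [coords_S_prime(F(w)) for w in S]
M = [list(row) for row in zip(*columns)]

# Theorem 6.10 for one sample vector: M [v]_S should equal [F(v)]_{S'}.
v = (2, -1, 3)
lhs = tuple(sum(m * c for m, c in zip(row, coords_S(v))) for row in M)
rhs = coords_S_prime(F(v))
```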


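6.32. Let $F: \mathbf{R}^n \to \mathbf{R}^m$ be the linear mapping defined as follows:
\[ F(x_1, x_2, \ldots, x_n) = (a_{11}x_1 + \cdots + a_{1n}x_n,\ a_{21}x_1 + \cdots + a_{2n}x_n,\ \ldots,\ a_{m1}x_1 + \cdots + a_{mn}x_n) \]

(a) Show that the rows of the matrix $[F]$ representing $F$ relative to the usual bases of $\mathbf{R}^n$ and $\mathbf{R}^m$ are the coefficients of the $x_i$ in the components of $F(x_1,\ldots,x_n)$.

(b) Find the matrix representation of each of the following linear mappings relative to the usual basis of $\mathbf{R}^n$:
(i) $F: \mathbf{R}^2 \to \mathbf{R}^3$ defined by $F(x,y) = (3x - y,\ 2x + 4y,\ 5x - 6y)$.
(ii) $F: \mathbf{R}^4 \to \mathbf{R}^2$ defined by $F(x,y,s,t) = (3x - 4y + 2s - 5t,\ 5x + 7y - s - 2t)$.
(iii) $F: \mathbf{R}^3 \to \mathbf{R}^4$ defined by $F(x,y,z) = (2x + 3y - 8z,\ x + y + z,\ 4x - 5z,\ 6y)$.

(a) We have
\[
\begin{array}{l}
F(1,0,\ldots,0) = (a_{11}, a_{21}, \ldots, a_{m1}) \\
F(0,1,\ldots,0) = (a_{12}, a_{22}, \ldots, a_{m2}) \\
\cdots \\
F(0,0,\ldots,1) = (a_{1n}, a_{2n}, \ldots, a_{mn})
\end{array}
\quad\text{and thus,}\quad
[F] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}
\]

(b) By part (a), we need only look at the coefficients of the unknowns $x, y, \ldots$ in $F(x,y,\ldots)$. Thus,
\[
\text{(i) } [F] = \begin{bmatrix} 3 & -1 \\ 2 & 4 \\ 5 & -6 \end{bmatrix}, \qquad
\text{(ii) } [F] = \begin{bmatrix} 3 & -4 & 2 & -5 \\ 5 & 7 & -1 & -2 \end{bmatrix}, \qquad
\text{(iii) } [F] = \begin{bmatrix} 2 & 3 & -8 \\ 1 & 1 & 1 \\ 4 & 0 & -5 \\ 0 & 6 & 0 \end{bmatrix}
\]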
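Part (a) says that column $j$ of $[F]$ is $F(e_j)$, which suggests a direct way to build the matrix from the map itself. The helper `matrix_of` below is a hypothetical name used for illustration; it assumes the supplied function really is linear.

```python
def matrix_of(F, n):
    """Matrix of a linear map F: R^n -> R^m relative to the usual bases."""
    # Column j is F(e_j), where e_j is the j-th usual basis vector.
    columns = [F(*[1 if i == j else 0 for i in range(n)]) for j in range(n)]
    # Transpose the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]

def F1(x, y):
    """Mapping (i) of part (b)."""
    return (3 * x - y, 2 * x + 4 * y, 5 * x - 6 * y)

def F3(x, y, z):
    """Mapping (iii) of part (b)."""
    return (2 * x + 3 * y - 8 * z, x + y + z, 4 * x - 5 * z, 6 * y)

M1 = matrix_of(F1, 2)
M3 = matrix_of(F3, 3)
```

Reading the rows of `M1` and `M3` back against the formulas for $F$ shows the coefficient pattern described in part (a).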
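6.33. Let $A = \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}$. Recall that $A$ determines a mapping $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(v) = Av$, where vectors are written as columns. Find the matrix $[F]$ that represents the mapping relative to the following bases of $\mathbf{R}^3$ and $\mathbf{R}^2$:

(a) The usual bases of $\mathbf{R}^3$ and of $\mathbf{R}^2$.
(b) $S = \{w_1, w_2, w_3\} = \{(1,1,1),\ (1,1,0),\ (1,0,0)\}$ and $S' = \{u_1, u_2\} = \{(1,3),\ (2,5)\}$.

(a) Relative to the usual bases, $[F]$ is the matrix $A$.
(b) From Problem 6.2, $(a,b) = (-5a + 2b)u_1 + (3a - b)u_2$. Thus,
\[
\begin{aligned}
F(w_1) &= \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} = -12u_1 + 8u_2 \\
F(w_2) &= \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 7 \\ -3 \end{bmatrix} = -41u_1 + 24u_2 \\
F(w_3) &= \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix} = -8u_1 + 5u_2
\end{aligned}
\]
Writing the coefficients of $F(w_1)$, $F(w_2)$, $F(w_3)$ as columns yields
\[ [F] = \begin{bmatrix} -12 & -41 & -8 \\ 8 & 24 & 5 \end{bmatrix} \]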


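6.34. Consider the linear transformation $T$ on $\mathbf{R}^2$ defined by $T(x,y) = (2x - 3y,\ x + 4y)$ and the following bases of $\mathbf{R}^2$:
\[ E = \{e_1, e_2\} = \{(1,0),\ (0,1)\} \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,3),\ (2,5)\} \]

(a) Find the matrix $A$ representing $T$ relative to the bases $E$ and $S$.
(b) Find the matrix $B$ representing $T$ relative to the bases $S$ and $E$.
(We can view $T$ as a linear mapping from one space into another, each having its own basis.)

(a) From Problem 6.2, $(a,b) = (-5a + 2b)u_1 + (3a - b)u_2$. Hence,
\[ \begin{array}{l} T(e_1) = T(1,0) = (2,1) = -8u_1 + 5u_2 \\ T(e_2) = T(0,1) = (-3,4) = 23u_1 - 13u_2 \end{array} \quad\text{and so}\quad A = \begin{bmatrix} -8 & 23 \\ 5 & -13 \end{bmatrix} \]
(b) We have
\[ \begin{array}{l} T(u_1) = T(1,3) = (-7,13) = -7e_1 + 13e_2 \\ T(u_2) = T(2,5) = (-11,22) = -11e_1 + 22e_2 \end{array} \quad\text{and so}\quad B = \begin{bmatrix} -7 & -11 \\ 13 & 22 \end{bmatrix} \]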
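6.35. How are the matrices $A$ and $B$ in Problem 6.34 related?

By Theorem 6.12, the matrices $A$ and $B$ are equivalent to each other; that is, there exist nonsingular matrices $P$ and $Q$ such that $B = Q^{-1}AP$. Here $A$ uses $E$ on the domain and $S$ on the codomain, whereas $B$ uses $S$ on the domain and $E$ on the codomain, so $P$ is the change-of-basis matrix from $E$ to $S$, and $Q$ is the change-of-basis matrix from $S$ to $E$. Thus,
\[ P = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}, \qquad Q = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}, \qquad Q^{-1} = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \]
and
\[ Q^{-1}AP = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}\begin{bmatrix} -8 & 23 \\ 5 & -13 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} = \begin{bmatrix} -7 & -11 \\ 13 & 22 \end{bmatrix} = B \]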
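The relation between $A$ and $B$ can be checked by multiplying out the product. In this problem both outer factors of $Q^{-1}AP$ equal the matrix whose columns are $u_1 = (1,3)$ and $u_2 = (2,5)$; the sketch below uses that fact, with `matmul` and `S_cols` as names assumed for illustration.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[-8, 23], [5, -13]]     # T relative to E (domain) and S (codomain)
S_cols = [[1, 2], [3, 5]]    # columns are u1 = (1,3) and u2 = (2,5)

# Multiplying A on the left by S_cols converts S-coordinates of outputs
# back to usual coordinates, giving the matrix of T in the usual basis.
standard = matmul(S_cols, A)

# Multiplying on the right by S_cols feeds in the vectors u1, u2 as inputs.
B = matmul(standard, S_cols)
```

The intermediate matrix `standard` is the usual-basis matrix of $T(x,y) = (2x - 3y,\ x + 4y)$, which makes the role of each factor visible.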


6.36. Prove Theorem 6.14: Let $F: V \to U$ be linear and, say, $\operatorname{rank}(F) = r$. Then there exist bases of $V$ and of $U$ such that the matrix representation of $F$ has the following form, where $I_r$ is the $r$-square identity matrix:
\[ A = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \]

Suppose $\dim V = m$ and $\dim U = n$. Let $W$ be the kernel of $F$ and $U'$ the image of $F$. We are given that $\operatorname{rank}(F) = r$. Hence, the dimension of the kernel of $F$ is $m - r$. Let $\{w_1, \ldots, w_{m-r}\}$ be a basis of the kernel of $F$ and extend this to a basis of $V$:
\[ \{v_1, \ldots, v_r, w_1, \ldots, w_{m-r}\} \]
Set
\[ u_1 = F(v_1),\quad u_2 = F(v_2),\quad \ldots,\quad u_r = F(v_r) \]
Then $\{u_1, \ldots, u_r\}$ is a basis of $U'$, the image of $F$. Extend this to a basis of $U$, say
\[ \{u_1, \ldots, u_r, u_{r+1}, \ldots, u_n\} \]
Observe that
\[
\begin{array}{lcl}
F(v_1) = u_1 &=& 1u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \\
F(v_2) = u_2 &=& 0u_1 + 1u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \\
\cdots \\
F(v_r) = u_r &=& 0u_1 + 0u_2 + \cdots + 1u_r + 0u_{r+1} + \cdots + 0u_n \\
F(w_1) = 0 &=& 0u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \\
\cdots \\
F(w_{m-r}) = 0 &=& 0u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n
\end{array}
\]
Thus, the matrix of $F$ in the above bases has the required form.


SUPPLEMENTARY PROBLEMS 


Matrices and Linear Operators 


6.37. Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x,y) = (4x + 5y,\ 2x - y)$.

(a) Find the matrix $A$ representing $F$ in the usual basis $E$.
(b) Find the matrix $B$ representing $F$ in the basis $S = \{u_1, u_2\} = \{(1,4),\ (2,9)\}$.
(c) Find $P$ such that $B = P^{-1}AP$.
(d) For $v = (a,b)$, find $[v]_S$ and $[F(v)]_S$. Verify that $[F]_S[v]_S = [F(v)]_S$.


6.38. Let $A: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by the matrix $A = \begin{bmatrix} 5 & -1 \\ 2 & 4 \end{bmatrix}$.

(a) Find the matrix $B$ representing $A$ relative to the basis $S = \{u_1, u_2\} = \{(1,3),\ (2,8)\}$. (Recall that $A$ represents the mapping $A$ relative to the usual basis $E$.)

(b) For $v = (a,b)$, find $[v]_S$ and $[A(v)]_S$.


6.39. For each linear transformation $L$ on $\mathbf{R}^2$, find the matrix $A$ representing $L$ (relative to the usual basis of $\mathbf{R}^2$):

(a) $L$ is the rotation in $\mathbf{R}^2$ counterclockwise by $45^\circ$.
(b) $L$ is the reflection in $\mathbf{R}^2$ about the line $y = x$.
(c) $L$ is defined by $L(1,0) = (3,5)$ and $L(0,1) = (7,-2)$.
(d) $L$ is defined by $L(1,1) = (3,7)$ and $L(1,2) = (5,-4)$.


6.40. Find the matrix representing each linear transformation $T$ on $\mathbf{R}^3$ relative to the usual basis of $\mathbf{R}^3$:

(a) $T(x,y,z) = (x,\ y,\ 0)$. (b) $T(x,y,z) = (z,\ y + z,\ x + y + z)$.
(c) $T(x,y,z) = (2x - 7y - 4z,\ 3x + y + 4z,\ 6x - 8y + z)$.

6.41. Repeat Problem 6.40 using the basis $S = \{u_1, u_2, u_3\} = \{(1,1,0),\ (1,2,3),\ (1,3,5)\}$.

6.42. Let $L$ be the linear transformation on $\mathbf{R}^3$ defined by
\[ L(1,0,0) = (1,1,1), \quad L(0,1,0) = (1,3,5), \quad L(0,0,1) = (2,2,2) \]

(a) Find the matrix $A$ representing $L$ relative to the usual basis of $\mathbf{R}^3$.
(b) Find the matrix $B$ representing $L$ relative to the basis $S$ in Problem 6.41.


6.43. Let $D$ denote the differential operator; that is, $D(f(t)) = df/dt$. Each of the following sets is a basis of a vector space $V$ of functions. Find the matrix representing $D$ in each basis:

(a) $\{e^t,\ e^{2t},\ te^{2t}\}$. (b) $\{1,\ t,\ \sin 3t,\ \cos 3t\}$. (c) $\{e^{5t},\ te^{5t},\ t^2e^{5t}\}$.

6.44. Let $D$ denote the differential operator on the vector space $V$ of functions with basis $S = \{\sin\theta,\ \cos\theta\}$.

(a) Find the matrix $A = [D]_S$. (b) Use $A$ to show that $D$ is a zero of $f(t) = t^2 + 1$.

6.45. Let $V$ be the vector space of $2 \times 2$ matrices. Consider the following matrix $M$ and usual basis $E$ of $V$:
\[ M = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad\text{and}\quad E = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\} \]
Find the matrix representing each of the following linear operators $T$ on $V$ relative to $E$:

(a) $T(A) = MA$. (b) $T(A) = AM$. (c) $T(A) = MA - AM$.

6.46. Let $\mathbf{1}_V$ and $\mathbf{0}_V$ denote the identity and zero operators, respectively, on a vector space $V$. Show that, for any basis $S$ of $V$, (a) $[\mathbf{1}_V]_S = I$, the identity matrix. (b) $[\mathbf{0}_V]_S = 0$, the zero matrix.


Change of Basis 


6.47. Find the change-of-basis matrix $P$ from the usual basis $E$ of $\mathbf{R}^2$ to a basis $S$, the change-of-basis matrix $Q$ from $S$ back to $E$, and the coordinates of $v = (a,b)$ relative to $S$, for the following bases $S$:

(a) $S = \{(1,2),\ (3,5)\}$. (b) $S = \{(1,-3),\ (3,-8)\}$.
(c) $S = \{(2,5),\ (3,7)\}$. (d) $S = \{(2,3),\ (4,5)\}$.

6.48. Consider the bases $S = \{(1,2),\ (2,3)\}$ and $S' = \{(1,3),\ (1,4)\}$ of $\mathbf{R}^2$. Find the change-of-basis matrix:
(a) $P$ from $S$ to $S'$. (b) $Q$ from $S'$ back to $S$.

6.49. Suppose that the $x$-axis and $y$-axis in the plane $\mathbf{R}^2$ are rotated counterclockwise $30^\circ$ to yield a new $x'$-axis and $y'$-axis for the plane. Find

(a) The unit vectors in the direction of the new $x'$-axis and $y'$-axis.
(b) The change-of-basis matrix $P$ for the new coordinate system.
(c) The new coordinates of the points $A(1,3)$, $B(2,-5)$, $C(a,b)$.

6.50. Find the change-of-basis matrix $P$ from the usual basis $E$ of $\mathbf{R}^3$ to a basis $S$, the change-of-basis matrix $Q$ from $S$ back to $E$, and the coordinates of $v = (a,b,c)$ relative to $S$, where $S$ consists of the vectors:

(a) $u_1 = (1,1,0)$, $u_2 = (0,1,2)$, $u_3 = (0,1,1)$.
(b) $u_1 = (1,0,1)$, $u_2 = (1,1,2)$, $u_3 = (1,2,4)$.
(c) $u_1 = (1,2,1)$, $u_2 = (1,3,4)$, $u_3 = (2,5,6)$.

6.51. Suppose $S_1, S_2, S_3$ are bases of $V$. Let $P$ and $Q$ be the change-of-basis matrices, respectively, from $S_1$ to $S_2$ and from $S_2$ to $S_3$. Prove that $PQ$ is the change-of-basis matrix from $S_1$ to $S_3$.


Linear Operators and Change of Basis 


6.52. Consider the linear operator $F$ on $\mathbf{R}^2$ defined by $F(x,y) = (5x + y,\ 3x - 2y)$ and the following bases of $\mathbf{R}^2$:
\[ S = \{(1,2),\ (2,3)\} \quad\text{and}\quad S' = \{(1,3),\ (1,4)\} \]

(a) Find the matrix $A$ representing $F$ relative to the basis $S$.
(b) Find the matrix $B$ representing $F$ relative to the basis $S'$.
(c) Find the change-of-basis matrix $P$ from $S$ to $S'$.
(d) How are $A$ and $B$ related?

6.53. Let $A: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by the matrix $A = \begin{bmatrix} 1 & -1 \\ 3 & 2 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to each of the following bases: (a) $S = \{(1,3)^T,\ (2,5)^T\}$. (b) $S = \{(1,3)^T,\ (2,4)^T\}$.

6.54. Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x,y) = (x - 3y,\ 2x - 4y)$. Find the matrix $A$ that represents $F$ relative to each of the following bases: (a) $S = \{(2,5),\ (3,7)\}$. (b) $S = \{(2,3),\ (4,5)\}$.

6.55. Let $A: \mathbf{R}^3 \to \mathbf{R}^3$ be defined by the matrix $A = \begin{bmatrix} 1 & 3 & 1 \\ 2 & 7 & 4 \\ 1 & 4 & 3 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to the basis $S = \{(1,1,1)^T,\ (0,1,1)^T,\ (1,2,3)^T\}$.


Similarity of Matrices 
6.56. Let $A = \begin{bmatrix} 1 & 1 \\ 2 & -3 \end{bmatrix}$ and $P = \begin{bmatrix} 1 & -2 \\ 3 & -5 \end{bmatrix}$.
(a) Find $B = P^{-1}AP$. (b) Verify that $\operatorname{tr}(B) = \operatorname{tr}(A)$. (c) Verify that $\det(B) = \det(A)$.

6.57. Find the trace and determinant of each of the following linear maps on $\mathbf{R}^2$:
(a) $F(x,y) = (2x - 3y,\ 5x + 4y)$. (b) $G(x,y) = (ax + by,\ cx + dy)$.

6.58. Find the trace and determinant of each of the following linear maps on $\mathbf{R}^3$:
(a) $F(x,y,z) = (x + 3y,\ 3x - 2z,\ x - 4y - 3z)$.
(b) $G(x,y,z) = (y + 3z,\ 2x - 4z,\ 5x + 7y)$.

6.59. Suppose $S = \{u_1, u_2\}$ is a basis of $V$, and $T: V \to V$ is defined by $T(u_1) = 3u_1 - 2u_2$ and $T(u_2) = u_1 + 4u_2$. Suppose $S' = \{w_1, w_2\}$ is a basis of $V$ for which $w_1 = u_1 + u_2$ and $w_2 = 2u_1 + 3u_2$.

(a) Find the matrices $A$ and $B$ representing $T$ relative to the bases $S$ and $S'$, respectively.
(b) Find the matrix $P$ such that $B = P^{-1}AP$.

6.60. Let $A$ be a $2 \times 2$ matrix such that only $A$ is similar to itself. Show that $A$ is a scalar matrix, that is, that $A = \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix}$.

6.61. Show that all matrices similar to an invertible matrix are invertible. More generally, show that similar matrices have the same rank.


Matrix Representation of General Linear Mappings 

6.62. Find the matrix representation of each of the following linear maps relative to the usual basis for $\mathbf{R}^n$:
(a) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x,y,z) = (2x - 4y + 9z,\ 5x + 3y - 2z)$.
(b) $F: \mathbf{R}^2 \to \mathbf{R}^4$ defined by $F(x,y) = (3x + 4y,\ 5x - 2y,\ x + 7y,\ 4x)$.
(c) $F: \mathbf{R}^4 \to \mathbf{R}$ defined by $F(x_1,x_2,x_3,x_4) = 2x_1 + x_2 - 7x_3 - x_4$.

6.63. Let $G: \mathbf{R}^3 \to \mathbf{R}^2$ be defined by $G(x,y,z) = (2x + 3y - z,\ 4x - y + 2z)$.

(a) Find the matrix $A$ representing $G$ relative to the bases
\[ S = \{(1,1,0),\ (1,2,3),\ (1,3,5)\} \quad\text{and}\quad S' = \{(1,2),\ (2,3)\} \]
(b) For any $v = (a,b,c)$ in $\mathbf{R}^3$, find $[v]_S$ and $[G(v)]_{S'}$. (c) Verify that $A[v]_S = [G(v)]_{S'}$.

6.64. Let $H: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $H(x,y) = (2x + 7y,\ x - 3y)$ and consider the following bases of $\mathbf{R}^2$:
\[ S = \{(1,1),\ (1,2)\} \quad\text{and}\quad S' = \{(1,4),\ (1,5)\} \]

(a) Find the matrix $A$ representing $H$ relative to the bases $S$ and $S'$.
(b) Find the matrix $B$ representing $H$ relative to the bases $S'$ and $S$.

6.65. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ be defined by $F(x,y,z) = (2x + y - z,\ 3x - 2y + 4z)$.

(a) Find the matrix $A$ representing $F$ relative to the bases
\[ S = \{(1,1,1),\ (1,1,0),\ (1,0,0)\} \quad\text{and}\quad S' = \{(1,3),\ (1,4)\} \]
(b) Verify that, for any $v = (a,b,c)$ in $\mathbf{R}^3$, $A[v]_S = [F(v)]_{S'}$.

6.66. Let $S$ and $S'$ be bases of $V$, and let $\mathbf{1}_V$ be the identity mapping on $V$. Show that the matrix $A$ representing $\mathbf{1}_V$ relative to the bases $S$ and $S'$ is the inverse of the change-of-basis matrix $P$ from $S$ to $S'$; that is, $A = P^{-1}$.

6.67. Prove (a) Theorem 6.10, (b) Theorem 6.11, (c) Theorem 6.12, (d) Theorem 6.13. [Hint: See the proofs of the analogous Theorems 6.1 (Problem 6.9), 6.2 (Problem 6.10), 6.3 (Problem 6.11), and 6.7 (Problem 6.26).]


Miscellaneous Problems 


6.68. Suppose $F: V \to V$ is linear. A subspace $W$ of $V$ is said to be invariant under $F$ if $F(W) \subseteq W$. Suppose $W$ is invariant under $F$ and $\dim W = r$. Show that $F$ has a block triangular matrix representation $M = \begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where $A$ is an $r \times r$ submatrix.

6.69. Suppose $V = U \oplus W$, and suppose $U$ and $W$ are each invariant under a linear operator $F: V \to V$. Also, suppose $\dim U = r$ and $\dim W = s$. Show that $F$ has a block diagonal matrix representation $M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$, where $A$ and $B$ are $r \times r$ and $s \times s$ submatrices.

6.70. Two linear operators $F$ and $G$ on $V$ are said to be similar if there exists an invertible linear operator $T$ on $V$ such that $G = T^{-1} \circ F \circ T$. Prove

(a) $F$ and $G$ are similar if and only if, for any basis $S$ of $V$, $[F]_S$ and $[G]_S$ are similar matrices.
(b) If $F$ is diagonalizable (similar to a diagonal matrix), then any similar operator $G$ is also diagonalizable.


ANSWERS TO SUPPLEMENTARY PROBLEMS 


Notation: $M = [R_1;\ R_2;\ \ldots]$ represents a matrix $M$ with rows $R_1, R_2, \ldots$.

6.37. (a) $A = [4,5;\ 2,-1]$; (b) $B = [220,487;\ -98,-217]$; (c) $P = [1,2;\ 4,9]$;
(d) $[v]_S = [9a - 2b,\ -4a + b]^T$ and $[F(v)]_S = [32a + 47b,\ -14a - 21b]^T$

6.38. (a) $B = [-6,-28;\ 4,15]$;
(b) $[v]_S = [4a - b,\ -\tfrac{3}{2}a + \tfrac{1}{2}b]^T$ and $[A(v)]_S = [18a - 8b,\ \tfrac{1}{2}(-13a + 7b)]^T$

6.39. (a) $\tfrac{1}{2}[\sqrt{2},-\sqrt{2};\ \sqrt{2},\sqrt{2}]$; (b) $[0,1;\ 1,0]$; (c) $[3,7;\ 5,-2]$;
(d) $[1,2;\ 18,-11]$

6.40. (a) $[1,0,0;\ 0,1,0;\ 0,0,0]$; (b) $[0,0,1;\ 0,1,1;\ 1,1,1]$;
(c) $[2,-7,-4;\ 3,1,4;\ 6,-8,1]$

6.41. (a) $[1,3,5;\ 0,-5,-10;\ 0,3,6]$; (b) $[0,1,2;\ -1,2,3;\ 1,0,0]$;
(c) $[15,65,104;\ -49,-219,-351;\ 29,130,208]$

6.42. (a) $[1,1,2;\ 1,3,2;\ 1,5,2]$; (b) $[0;\ 2,14,22;\ 0,-5,-8]$ (here a single $0$ denotes a zero row)

6.43. (a) $[1,0,0;\ 0,2,1;\ 0,0,2]$; (b) $[0,1,0,0;\ 0;\ 0,0,0,-3;\ 0,0,3,0]$;
(c) $[5,1,0;\ 0,5,2;\ 0,0,5]$

6.44. (a) $A = [0,-1;\ 1,0]$; (b) $A^2 + I = 0$

6.45. (a) $[a,0,b,0;\ 0,a,0,b;\ c,0,d,0;\ 0,c,0,d]$;
(b) $[a,c,0,0;\ b,d,0,0;\ 0,0,a,c;\ 0,0,b,d]$;
(c) $[0,-c,b,0;\ -b,a-d,0,b;\ c,0,d-a,-c;\ 0,c,-b,0]$

6.47. (a) $[1,3;\ 2,5]$, $[-5,3;\ 2,-1]$, $[v] = [-5a + 3b,\ 2a - b]^T$;
(b) $[1,3;\ -3,-8]$, $[-8,-3;\ 3,1]$, $[v] = [-8a - 3b,\ 3a + b]^T$;
(c) $[2,3;\ 5,7]$, $[-7,3;\ 5,-2]$, $[v] = [-7a + 3b,\ 5a - 2b]^T$;
(d) $[2,4;\ 3,5]$, $[-\tfrac{5}{2},2;\ \tfrac{3}{2},-1]$, $[v] = [-\tfrac{5}{2}a + 2b,\ \tfrac{3}{2}a - b]^T$

6.48. (a) $P = [3,5;\ -1,-2]$; (b) $Q = [2,5;\ -1,-3]$

6.49. Here $K = \sqrt{3}$.
(a) $\tfrac{1}{2}(K, 1)$, $\tfrac{1}{2}(-1, K)$;
(b) $P = \tfrac{1}{2}[K,-1;\ 1,K]$;
(c) $\tfrac{1}{2}[K + 3,\ 3K - 1]^T$, $\tfrac{1}{2}[2K - 5,\ -5K - 2]^T$, $\tfrac{1}{2}[aK + b,\ bK - a]^T$

6.50. $P$ is the matrix whose columns are $u_1, u_2, u_3$, $Q = P^{-1}$, $[v] = Q[a,b,c]^T$.
(a) $Q = [1,0,0;\ 1,-1,1;\ -2,2,-1]$, $[v] = [a,\ a - b + c,\ -2a + 2b - c]^T$;
(b) $Q = [0,-2,1;\ 2,3,-2;\ -1,-1,1]$, $[v] = [-2b + c,\ 2a + 3b - 2c,\ -a - b + c]^T$;
(c) $Q = [-2,2,-1;\ -7,4,-1;\ 5,-3,1]$, $[v] = [-2a + 2b - c,\ -7a + 4b - c,\ 5a - 3b + c]^T$

6.52. (a) $[-23,-39;\ 15,26]$; (b) $[35,41;\ -27,-32]$; (c) $[3,5;\ -1,-2]$; (d) $B = P^{-1}AP$

6.53. (a) $[28,47;\ -15,-25]$; (b) $[13,18;\ -\tfrac{15}{2},-10]$

6.54. (a) $[43,60;\ -33,-46]$; (b) $\tfrac{1}{2}[3,7;\ -5,-9]$

6.55. $[10,8,20;\ 13,11,28;\ -5,-4,-10]$

6.56. (a) $[-34,57;\ -19,32]$; (b) $\operatorname{tr}(B) = \operatorname{tr}(A) = -2$; (c) $\det(B) = \det(A) = -5$

6.57. (a) $\operatorname{tr}(F) = 6$, $\det(F) = 23$; (b) $\operatorname{tr}(G) = a + d$, $\det(G) = ad - bc$

6.58. (a) $\operatorname{tr}(F) = -2$, $\det(F) = 13$; (b) $\operatorname{tr}(G) = 0$, $\det(G) = 22$

6.59. (a) $A = [3,1;\ -2,4]$, $B = [8,11;\ -2,-1]$; (b) $P = [1,2;\ 1,3]$

6.62. (a) $[2,-4,9;\ 5,3,-2]$; (b) $[3,4;\ 5,-2;\ 1,7;\ 4,0]$; (c) $[2,1,-7,-1]$

6.63. (a) $[-9,1,4;\ 7,2,1]$; (b) $[v]_S = [-a + 2b - c,\ 5a - 5b + 2c,\ -3a + 3b - c]^T$, and
$[G(v)]_{S'} = [2a - 11b + 7c,\ 7b - 4c]^T$

6.64. (a) $A = [47,85;\ -38,-69]$; (b) $B = [71,88;\ -41,-51]$

6.65. $A = [3,11,5;\ -1,-8,-3]$


Inner Product Spaces, 
Orthogonality 


7.1 Introduction 


The definition of a vector space V involves an arbitrary field K. Here we first restrict K to be the real field 
R, in which case V is called a real vector space; in the last sections of this chapter, we extend our results 
to the case where K is the complex field C, in which case V is called a complex vector space. Also, we 
adopt the previous notation that 


$u, v, w$ are vectors in $V$

$a, b, c, k$ are scalars in $K$

Furthermore, the vector spaces $V$ in this chapter have finite dimension unless otherwise stated or implied.
Recall that the concepts of "length" and "orthogonality" did not appear in the investigation of arbitrary vector spaces $V$ (although they did appear in Section 1.4 on the spaces $\mathbf{R}^n$ and $\mathbf{C}^n$). Here we place an additional structure on a vector space $V$ to obtain an inner product space, and in this context these concepts are defined.


7.2 Inner Product Spaces 


We begin with a definition. 


DEFINITION: Let $V$ be a real vector space. Suppose to each pair of vectors $u, v \in V$ there is assigned a real number, denoted by $\langle u, v \rangle$. This function is called a (real) inner product on $V$ if it satisfies the following axioms:

$[I_1]$ (Linear Property): $\langle au_1 + bu_2, v \rangle = a\langle u_1, v \rangle + b\langle u_2, v \rangle$.

$[I_2]$ (Symmetric Property): $\langle u, v \rangle = \langle v, u \rangle$.

$[I_3]$ (Positive Definite Property): $\langle u, u \rangle \geq 0$; and $\langle u, u \rangle = 0$ if and only if $u = 0$.

The vector space $V$ with an inner product is called a (real) inner product space.


Axiom $[I_1]$ states that an inner product function is linear in the first position. Using $[I_1]$ and the symmetry axiom $[I_2]$, we obtain
\[ \langle u, cv_1 + dv_2 \rangle = \langle cv_1 + dv_2, u \rangle = c\langle v_1, u \rangle + d\langle v_2, u \rangle = c\langle u, v_1 \rangle + d\langle u, v_2 \rangle \]
That is, the inner product function is also linear in its second position. Combining these two properties and using induction yields the following general formula:
\[ \Big\langle \sum_i a_iu_i,\ \sum_j b_jv_j \Big\rangle = \sum_i \sum_j a_ib_j\langle u_i, v_j \rangle \]
That is, an inner product of linear combinations of vectors is equal to a linear combination of the inner products of the vectors.


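EXAMPLE 7.1 Let $V$ be a real inner product space. Then, by linearity,
\[
\begin{aligned}
\langle 3u_1 - 4u_2,\ 2v_1 - 5v_2 + 6v_3 \rangle = {} & 6\langle u_1, v_1 \rangle - 15\langle u_1, v_2 \rangle + 18\langle u_1, v_3 \rangle \\
& - 8\langle u_2, v_1 \rangle + 20\langle u_2, v_2 \rangle - 24\langle u_2, v_3 \rangle
\end{aligned}
\]
\[
\begin{aligned}
\langle 2u - 5v,\ 4u + 6v \rangle &= 8\langle u, u \rangle + 12\langle u, v \rangle - 20\langle v, u \rangle - 30\langle v, v \rangle \\
&= 8\langle u, u \rangle - 8\langle v, u \rangle - 30\langle v, v \rangle
\end{aligned}
\]
Observe that in the last equation we have used the symmetry property that $\langle u, v \rangle = \langle v, u \rangle$.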
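The second expansion can be spot-checked with the dot product on $\mathbf{R}^3$, which is one particular inner product. The vectors and the helper names `dot` and `comb` below are arbitrary choices made for illustration; a numerical check of this kind illustrates the identity but does not prove it.

```python
def dot(u, v):
    """Usual inner product (dot product) of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def comb(c1, u, c2, v):
    """Linear combination c1*u + c2*v."""
    return [c1 * a + c2 * b for a, b in zip(u, v)]

u = [1, -2, 3]    # sample vectors, chosen arbitrarily
v = [4, 0, -1]

# Left side: <2u - 5v, 4u + 6v> computed directly.
lhs = dot(comb(2, u, -5, v), comb(4, u, 6, v))

# Right side: the expansion 8<u,u> - 8<v,u> - 30<v,v>.
rhs = 8 * dot(u, u) - 8 * dot(v, u) - 30 * dot(v, v)
```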
Remark: Axiom $[I_1]$ by itself implies $\langle 0, 0 \rangle = \langle 0v, 0 \rangle = 0\langle v, 0 \rangle = 0$. Thus, $[I_1]$, $[I_2]$, $[I_3]$ are equivalent to $[I_1]$, $[I_2]$, and the following axiom:

$[I_3']$ If $u \neq 0$, then $\langle u, u \rangle$ is positive.

That is, a function satisfying $[I_1]$, $[I_2]$, $[I_3']$ is an inner product.


Norm of a Vector 


By the third axiom $[I_3]$ of an inner product, $\langle u, u \rangle$ is nonnegative for any vector $u$. Thus, its nonnegative square root exists. We use the notation
\[ \|u\| = \sqrt{\langle u, u \rangle} \]
This nonnegative number is called the norm or length of $u$. The relation $\|u\|^2 = \langle u, u \rangle$ will be used frequently.

Remark: If $\|u\| = 1$ or, equivalently, if $\langle u, u \rangle = 1$, then $u$ is called a unit vector and it is said to be normalized. Every nonzero vector $v$ in $V$ can be multiplied by the reciprocal of its length to obtain the unit vector
\[ \hat{v} = \frac{1}{\|v\|}\,v \]
which is a positive multiple of $v$. This process is called normalizing $v$.


7.3 Examples of Inner Product Spaces 


This section lists the main examples of inner product spaces used in this text. 


Euclidean $n$-Space $\mathbf{R}^n$
Consider the vector space $\mathbf{R}^n$. The dot product or scalar product in $\mathbf{R}^n$ is defined by
\[ u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n \]
where $u = (a_i)$ and $v = (b_i)$. This function defines an inner product on $\mathbf{R}^n$. The norm $\|u\|$ of the vector $u = (a_i)$ in this space is as follows:
\[ \|u\| = \sqrt{u \cdot u} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2} \]
On the other hand, by the Pythagorean theorem, the distance from the origin $O$ in $\mathbf{R}^3$ to a point $P(a,b,c)$ is given by $\sqrt{a^2 + b^2 + c^2}$. This is precisely the same as the above-defined norm of the vector $v = (a,b,c)$ in $\mathbf{R}^3$. Because the Pythagorean theorem is a consequence of the axioms of Euclidean geometry, the vector space $\mathbf{R}^n$ with the above inner product and norm is called Euclidean $n$-space. Although there are many ways to define an inner product on $\mathbf{R}^n$, we shall assume this inner product unless otherwise stated or implied. It is called the usual (or standard) inner product on $\mathbf{R}^n$.


Remark: Frequently the vectors in $\mathbf{R}^n$ will be represented by column vectors, that is, by $n \times 1$ column matrices. In such a case, the formula
\[ \langle u, v \rangle = u^Tv \]
defines the usual inner product on $\mathbf{R}^n$.


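EXAMPLE 7.2 Let $u = (1,3,-4,2)$, $v = (4,-2,2,1)$, $w = (5,-1,-2,6)$ in $\mathbf{R}^4$.

(a) Show $\langle 3u - 2v, w \rangle = 3\langle u, w \rangle - 2\langle v, w \rangle$.
By definition,
\[ \langle u, w \rangle = 5 - 3 + 8 + 12 = 22 \quad\text{and}\quad \langle v, w \rangle = 20 + 2 - 4 + 6 = 24 \]
Note that $3u - 2v = (-5, 13, -16, 4)$. Thus,
\[ \langle 3u - 2v, w \rangle = -25 - 13 + 32 + 24 = 18 \]
As expected, $3\langle u, w \rangle - 2\langle v, w \rangle = 3(22) - 2(24) = 18 = \langle 3u - 2v, w \rangle$.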


(b) Normalize u and v. 
By definition, 


lul = VI+9+16+4=v30 and  |ol|=vI6+4+4+1=5 


We normalize u and v to obtain the following unit vectors in the directions of u and v, respectively: 


1l ( 1 3 —4 2 ) r 1 (: -72 ) 
u = u = 5 j à an = T= ; NT 
llull V30 V30 V30 v30 loll 5° 555 
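
The arithmetic in Example 7.2 can be checked mechanically. The following is a minimal Python sketch, not part of the original text; the helper names `dot`, `norm`, and `normalize` are illustrative assumptions, and floating-point arithmetic is used for the square roots.

```python
import math

def dot(x, y):
    """Usual inner product on R^n: sum of products of corresponding components."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Norm induced by the usual inner product."""
    return math.sqrt(dot(x, x))

def normalize(x):
    """Multiply x by the reciprocal of its length (x is assumed nonzero)."""
    length = norm(x)
    return tuple(a / length for a in x)

u = (1, 3, -4, 2)
v = (4, -2, 2, 1)
w = (5, -1, -2, 6)

# Part (a): compare <3u - 2v, w> with 3<u, w> - 2<v, w>.
combo = tuple(3 * a - 2 * b for a, b in zip(u, v))
lhs = dot(combo, w)
rhs = 3 * dot(u, w) - 2 * dot(v, w)

# Part (b): unit vectors in the directions of u and v.
u_hat = normalize(u)
v_hat = normalize(v)
```

Both sides of part (a) evaluate to the same integer, and the normalized vectors have length 1 up to rounding.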


> 


Function Space C[a, b] and Polynomial Space P(t)


The notation C[a, b] is used to denote the vector space of all continuous functions on the closed interval
[a, b], that is, where $a \le t \le b$. The following defines an inner product on C[a, b], where f(t) and g(t)
are functions in C[a, b]:

$\langle f, g \rangle = \int_a^b f(t) g(t)\, dt$

It is called the usual inner product on C[a, b].
The vector space P(t) of all polynomials is a subspace of C[a, b] for any interval [a, b], and hence, the
above is also an inner product on P(t).


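EXAMPLE 7.3

Consider f(t) = 3t - 5 and $g(t) = t^2$ in the polynomial space P(t) with inner product

$\langle f, g \rangle = \int_0^1 f(t) g(t)\, dt$

(a) Find $\langle f, g \rangle$.
We have $f(t) g(t) = 3t^3 - 5t^2$. Hence,

$\langle f, g \rangle = \int_0^1 (3t^3 - 5t^2)\, dt = \left[ \tfrac{3}{4} t^4 - \tfrac{5}{3} t^3 \right]_0^1 = \tfrac{3}{4} - \tfrac{5}{3} = -\tfrac{11}{12}$

(b) Find $\|f\|$ and $\|g\|$.
We have $[f(t)]^2 = f(t) f(t) = 9t^2 - 30t + 25$ and $[g(t)]^2 = t^4$. Then

$\|f\|^2 = \langle f, f \rangle = \int_0^1 (9t^2 - 30t + 25)\, dt = \left[ 3t^3 - 15t^2 + 25t \right]_0^1 = 13$

$\|g\|^2 = \langle g, g \rangle = \int_0^1 t^4\, dt = \left[ \tfrac{1}{5} t^5 \right]_0^1 = \tfrac{1}{5}$

Therefore, $\|f\| = \sqrt{13}$ and $\|g\| = \sqrt{1/5} = \tfrac{1}{5}\sqrt{5}$.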
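
The integrals in Example 7.3 can be reproduced exactly with rational arithmetic. The sketch below is illustrative only: it assumes a polynomial is stored as a list of coefficients, lowest degree first, and the names `poly_mul`, `integrate`, and `inner` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def integrate(p, lo, hi):
    """Exact integral of the polynomial p over [lo, hi] via its antiderivative."""
    def antiderivative(x):
        return sum(Fraction(c) / (k + 1) * Fraction(x) ** (k + 1)
                   for k, c in enumerate(p))
    return antiderivative(hi) - antiderivative(lo)

def inner(p, q, lo=0, hi=1):
    """Inner product <p, q> = integral of p(t) q(t) over [lo, hi]."""
    return integrate(poly_mul(p, q), lo, hi)

f = [-5, 3]     # f(t) = 3t - 5
g = [0, 0, 1]   # g(t) = t^2

fg = inner(f, g)        # <f, g>
f_sq = inner(f, f)      # ||f||^2
g_sq = inner(g, g)      # ||g||^2
```

The three values agree with the hand computation: -11/12, 13, and 1/5.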


Matrix Space $M = M_{m,n}$
Let $M = M_{m,n}$, the vector space of all real m x n matrices. An inner product is defined on M by

$\langle A, B \rangle = \mathrm{tr}(B^T A)$

where, as usual, tr( ) is the trace, the sum of the diagonal elements. If $A = [a_{ij}]$ and $B = [b_{ij}]$, then

$\langle A, B \rangle = \mathrm{tr}(B^T A) = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} b_{ij}$ and $\|A\|^2 = \langle A, A \rangle = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2$

That is, $\langle A, B \rangle$ is the sum of the products of the corresponding entries in A and B and, in particular, $\langle A, A \rangle$
is the sum of the squares of the entries of A.


Hilbert Space
Let V be the vector space of all infinite sequences of real numbers $(a_1, a_2, a_3, \ldots)$ satisfying

$\sum_{i=1}^{\infty} a_i^2 = a_1^2 + a_2^2 + \cdots < \infty$

that is, the sum converges. Addition and scalar multiplication are defined in V componentwise; that is, if

$u = (a_1, a_2, \ldots)$ and $v = (b_1, b_2, \ldots)$

then $u + v = (a_1 + b_1, a_2 + b_2, \ldots)$ and $ku = (ka_1, ka_2, \ldots)$

An inner product is defined in V by

$\langle u, v \rangle = a_1 b_1 + a_2 b_2 + \cdots$

The above sum converges absolutely for any pair of points in V. Hence, the inner product is well defined.
This inner product space is called $\ell_2$-space or Hilbert space.


7.4 Cauchy-Schwarz Inequality, Applications 


The following formula (proved in Problem 7.8) is called the Cauchy—Schwarz inequality or Schwarz 
inequality. It is used in many branches of mathematics. 


THEOREM 7.1: (Cauchy-Schwarz) For any vectors u and v in an inner product space V,

$\langle u, v \rangle^2 \le \langle u, u \rangle \langle v, v \rangle$ or $|\langle u, v \rangle| \le \|u\| \|v\|$


Next we examine this inequality in specific cases. 


EXAMPLE 7.4
(a) Consider any real numbers $a_1, \ldots, a_n$, $b_1, \ldots, b_n$. Then, by the Cauchy-Schwarz inequality,

$(a_1 b_1 + a_2 b_2 + \cdots + a_n b_n)^2 \le (a_1^2 + \cdots + a_n^2)(b_1^2 + \cdots + b_n^2)$

That is, $(u \cdot v)^2 \le \|u\|^2 \|v\|^2$, where $u = (a_i)$ and $v = (b_i)$.

(b) Let f and g be continuous functions on the unit interval [0, 1]. Then, by the Cauchy-Schwarz inequality,

$\left[ \int_0^1 f(t) g(t)\, dt \right]^2 \le \int_0^1 f^2(t)\, dt \int_0^1 g^2(t)\, dt$

That is, $(\langle f, g \rangle)^2 \le \|f\|^2 \|g\|^2$. Here V is the inner product space C[0, 1].


The next theorem (proved in Problem 7.9) gives the basic properties of a norm. The proof of the third 
property requires the Cauchy—Schwarz inequality. 


THEOREM 7.2: Let V be an inner product space. Then the norm in V satisfies the following
properties:
$[N_1]$ $\|v\| \ge 0$; and $\|v\| = 0$ if and only if v = 0.
$[N_2]$ $\|kv\| = |k| \|v\|$.
$[N_3]$ $\|u + v\| \le \|u\| + \|v\|$.

The property $[N_3]$ is called the triangle inequality, because if we view u + v as the side of the triangle
formed with sides u and v (as shown in Fig. 7-1), then $[N_3]$ states that the length of one side of a triangle
cannot be greater than the sum of the lengths of the other two sides.

[Figure 7-1: Triangle inequality]


Angle Between Vectors


For any nonzero vectors u and v in an inner product space V, the angle between u and v is defined to be
the angle $\theta$ such that $0 \le \theta \le \pi$ and

$\cos \theta = \frac{\langle u, v \rangle}{\|u\| \|v\|}$

By the Cauchy-Schwarz inequality, $-1 \le \cos \theta \le 1$, and so the angle exists and is unique.


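EXAMPLE 7.5
(a) Consider vectors u = (2, 3, 5) and v = (1, -4, 3) in $R^3$. Then

$\langle u, v \rangle = 2 - 12 + 15 = 5$, $\|u\| = \sqrt{4 + 9 + 25} = \sqrt{38}$, $\|v\| = \sqrt{1 + 16 + 9} = \sqrt{26}$

Then the angle $\theta$ between u and v is given by

$\cos \theta = \frac{5}{\sqrt{38} \sqrt{26}}$

Note that $\theta$ is an acute angle, because $\cos \theta$ is positive.

(b) Let f(t) = 3t - 5 and $g(t) = t^2$ in the polynomial space P(t) with inner product $\langle f, g \rangle = \int_0^1 f(t) g(t)\, dt$. By
Example 7.3,

$\langle f, g \rangle = -\tfrac{11}{12}$, $\|f\| = \sqrt{13}$, $\|g\| = \tfrac{1}{5}\sqrt{5}$

Then the "angle" $\theta$ between f and g is given by

$\cos \theta = \frac{-\tfrac{11}{12}}{\sqrt{13} \cdot \tfrac{1}{5}\sqrt{5}} = \frac{-55}{12 \sqrt{13} \sqrt{5}}$

Note that $\theta$ is an obtuse angle, because $\cos \theta$ is negative.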
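
The angle formula can be evaluated numerically for the vectors of part (a). This is a hedged sketch rather than part of the original text: the function name `angle` is an assumption, and the clamp to [-1, 1] is added only to guard against floating-point rounding (the Cauchy-Schwarz inequality guarantees the exact value lies in that range).

```python
import math

def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def angle(u, v):
    """Angle theta in [0, pi] with cos(theta) = <u, v> / (||u|| ||v||); u, v nonzero."""
    c = dot(u, v) / (norm(u) * norm(v))
    c = max(-1.0, min(1.0, c))   # guard against rounding just outside [-1, 1]
    return math.acos(c)

u = (2, 3, 5)
v = (1, -4, 3)
theta = angle(u, v)              # acute, because <u, v> = 5 is positive
```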


7.5 Orthogonality


Let V be an inner product space. The vectors $u, v \in V$ are said to be orthogonal and u is said to be
orthogonal to v if

$\langle u, v \rangle = 0$

The relation is clearly symmetric: if u is orthogonal to v, then $\langle v, u \rangle = 0$, and so v is orthogonal to u. We
note that $0 \in V$ is orthogonal to every $v \in V$, because

$\langle 0, v \rangle = \langle 0v, v \rangle = 0 \langle v, v \rangle = 0$

Conversely, if u is orthogonal to every $v \in V$, then $\langle u, u \rangle = 0$ and hence u = 0 by $[I_3]$. Observe that u and
v are orthogonal if and only if $\cos \theta = 0$, where $\theta$ is the angle between u and v. Also, this is true if and
only if u and v are "perpendicular", that is, $\theta = \pi/2$ (or $\theta = 90^\circ$).


EXAMPLE 7.6
(a) Consider the vectors u = (1, 1, 1), v = (1, 2, -3), w = (1, -4, 3) in $R^3$. Then

$\langle u, v \rangle = 1 + 2 - 3 = 0$, $\langle u, w \rangle = 1 - 4 + 3 = 0$, $\langle v, w \rangle = 1 - 8 - 9 = -16$

Thus, u is orthogonal to v and w, but v and w are not orthogonal.

(b) Consider the functions sin t and cos t in the vector space $C[-\pi, \pi]$ of continuous functions on the closed interval
$[-\pi, \pi]$. Then

$\langle \sin t, \cos t \rangle = \int_{-\pi}^{\pi} \sin t \cos t\, dt = \left[ \tfrac{1}{2} \sin^2 t \right]_{-\pi}^{\pi} = 0 - 0 = 0$

Thus, sin t and cos t are orthogonal functions in the vector space $C[-\pi, \pi]$.

Remark: A vector $w = (x_1, x_2, \ldots, x_n)$ is orthogonal to $u = (a_1, a_2, \ldots, a_n)$ in $R^n$ if

$\langle u, w \rangle = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0$

That is, w is orthogonal to u if w satisfies a homogeneous equation whose coefficients are the elements
of u.


EXAMPLE 7.7 Find a nonzero vector w that is orthogonal to $u_1 = (1, 2, 1)$ and $u_2 = (2, 5, 4)$ in $R^3$.

Let w = (x, y, z). Then we want $\langle u_1, w \rangle = 0$ and $\langle u_2, w \rangle = 0$. This yields the homogeneous system

x + 2y + z = 0, 2x + 5y + 4z = 0     or     x + 2y + z = 0, y + 2z = 0

Here z is the only free variable in the echelon system. Set z = 1 to obtain y = -2 and x = 3. Thus, w = (3, -2, 1) is
a desired nonzero vector orthogonal to $u_1$ and $u_2$.
Any multiple of w will also be orthogonal to $u_1$ and $u_2$. Normalizing w, we obtain the following unit vector
orthogonal to $u_1$ and $u_2$:

$\hat{w} = \frac{w}{\|w\|} = \left( \frac{3}{\sqrt{14}}, \frac{-2}{\sqrt{14}}, \frac{1}{\sqrt{14}} \right)$
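
In $R^3$ specifically, a vector orthogonal to two given vectors can also be produced by the cross product. That is a different technique from the echelon-system method used in Example 7.7, and it is shown here only as a cross-check; the function names are illustrative assumptions.

```python
def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def cross(a, b):
    """Cross product in R^3; the result is orthogonal to both a and b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

u1 = (1, 2, 1)
u2 = (2, 5, 4)
w = cross(u1, u2)

# w solves both homogeneous equations <u1, w> = 0 and <u2, w> = 0.
checks = (dot(u1, w), dot(u2, w))
```

For these two vectors the cross product happens to coincide with the vector w = (3, -2, 1) found above; in general it agrees only up to a scalar multiple.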


Orthogonal Complements 


Let S be a subset of an inner product space V. The orthogonal complement of S, denoted by $S^\perp$ (read "S
perp") consists of those vectors in V that are orthogonal to every vector $u \in S$; that is,

$S^\perp = \{ v \in V : \langle v, u \rangle = 0 \text{ for every } u \in S \}$

In particular, for a given vector u in V, we have

$u^\perp = \{ v \in V : \langle v, u \rangle = 0 \}$

that is, $u^\perp$ consists of all vectors in V that are orthogonal to the given vector u.
We show that $S^\perp$ is a subspace of V. Clearly $0 \in S^\perp$, because 0 is orthogonal to every vector in V. Now
suppose $v, w \in S^\perp$. Then, for any scalars a and b and any vector $u \in S$, we have

$\langle av + bw, u \rangle = a \langle v, u \rangle + b \langle w, u \rangle = a \cdot 0 + b \cdot 0 = 0$

Thus, $av + bw \in S^\perp$, and therefore $S^\perp$ is a subspace of V.
We state this result formally.


PROPOSITION 7.3: Let S be a subset of a vector space V. Then $S^\perp$ is a subspace of V.


Remark 1: Suppose u is a nonzero vector in $R^3$. Then there is a geometrical description of $u^\perp$.
Specifically, $u^\perp$ is the plane in $R^3$ through the origin O and perpendicular to the vector u. This is shown
in Fig. 7-2.

[Figure 7-2: Orthogonal complement $u^\perp$]


Remark 2: Let W be the solution space of an m x n homogeneous system AX = 0, where $A = [a_{ij}]$
and $X = [x_i]$. Recall that W may be viewed as the kernel of the linear mapping $A: R^n \to R^m$. Now we can
give another interpretation of W using the notion of orthogonality. Specifically, each solution vector
$w = (x_1, x_2, \ldots, x_n)$ is orthogonal to each row of A; hence, W is the orthogonal complement of the row
space of A.


EXAMPLE 7.8 Find a basis for the subspace $u^\perp$ of $R^3$, where u = (1, 3, -4).
Note that $u^\perp$ consists of all vectors w = (x, y, z) such that $\langle u, w \rangle = 0$, or x + 3y - 4z = 0. The free variables
are y and z.

(1) Set y = 1, z = 0 to obtain the solution $w_1 = (-3, 1, 0)$.
(2) Set y = 0, z = 1 to obtain the solution $w_2 = (4, 0, 1)$.

The vectors $w_1$ and $w_2$ form a basis for the solution space of the equation, and hence a basis for $u^\perp$.


Suppose W is a subspace of V. Then both W and $W^\perp$ are subspaces of V. The next theorem, whose
proof (Problem 7.28) requires results of later sections, is a basic result in linear algebra.


THEOREM 7.4: Let W be a subspace of V. Then V is the direct sum of W and $W^\perp$; that is,
$V = W \oplus W^\perp$.


7.6 Orthogonal Sets and Bases


Consider a set $S = \{u_1, u_2, \ldots, u_r\}$ of nonzero vectors in an inner product space V. S is called orthogonal
if each pair of vectors in S are orthogonal, and S is called orthonormal if S is orthogonal and each vector
in S has unit length. That is,

(i) Orthogonal: $\langle u_i, u_j \rangle = 0$ for $i \ne j$

(ii) Orthonormal: $\langle u_i, u_j \rangle = 0$ for $i \ne j$, and $\langle u_i, u_j \rangle = 1$ for $i = j$

Normalizing an orthogonal set S refers to the process of multiplying each vector in S by the reciprocal of
its length in order to transform S into an orthonormal set of vectors.
The following theorems apply.


THEOREM 7.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly independent.

THEOREM 7.6: (Pythagoras) Suppose $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of vectors. Then

$\|u_1 + u_2 + \cdots + u_r\|^2 = \|u_1\|^2 + \|u_2\|^2 + \cdots + \|u_r\|^2$

These theorems are proved in Problems 7.15 and 7.16, respectively. Here we prove the Pythagorean
theorem in the special and familiar case for two vectors. Specifically, suppose $\langle u, v \rangle = 0$. Then

$\|u + v\|^2 = \langle u + v, u + v \rangle = \langle u, u \rangle + 2\langle u, v \rangle + \langle v, v \rangle = \langle u, u \rangle + \langle v, v \rangle = \|u\|^2 + \|v\|^2$

which gives our result.


EXAMPLE 7.9
(a) Let $E = \{e_1, e_2, e_3\} = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$ be the usual basis of Euclidean space $R^3$. It is clear that

$\langle e_1, e_2 \rangle = \langle e_1, e_3 \rangle = \langle e_2, e_3 \rangle = 0$ and $\langle e_1, e_1 \rangle = \langle e_2, e_2 \rangle = \langle e_3, e_3 \rangle = 1$

Namely, E is an orthonormal basis of $R^3$. More generally, the usual basis of $R^n$ is orthonormal for every n.

(b) Let $V = C[-\pi, \pi]$ be the vector space of continuous functions on the interval $-\pi \le t \le \pi$ with inner product
defined by $\langle f, g \rangle = \int_{-\pi}^{\pi} f(t) g(t)\, dt$. Then the following is a classical example of an orthogonal set in V:

$\{1, \cos t, \cos 2t, \cos 3t, \ldots, \sin t, \sin 2t, \sin 3t, \ldots\}$

This orthogonal set plays a fundamental role in the theory of Fourier series.


Orthogonal Basis and Linear Combinations, Fourier Coefficients
Let S consist of the following three vectors in $R^3$:

$u_1 = (1, 2, 1)$, $u_2 = (2, 1, -4)$, $u_3 = (3, -2, 1)$

The reader can verify that the vectors are orthogonal; hence, they are linearly independent. Thus, S is an
orthogonal basis of $R^3$.

Suppose we want to write v = (7, 1, 9) as a linear combination of $u_1, u_2, u_3$. First we set v as a linear
combination of $u_1, u_2, u_3$ using unknowns $x_1, x_2, x_3$ as follows:

$v = x_1 u_1 + x_2 u_2 + x_3 u_3$ or $(7, 1, 9) = x_1 (1, 2, 1) + x_2 (2, 1, -4) + x_3 (3, -2, 1)$     (*)

We can proceed in two ways.
METHOD 1: Expand (*) (as in Chapter 3) to obtain the system

$x_1 + 2x_2 + 3x_3 = 7$, $2x_1 + x_2 - 2x_3 = 1$, $x_1 - 4x_2 + x_3 = 9$

Solve the system by Gaussian elimination to obtain $x_1 = 3$, $x_2 = -1$, $x_3 = 2$. Thus,
$v = 3u_1 - u_2 + 2u_3$.
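METHOD 2: (This method uses the fact that the basis vectors are orthogonal, and the arithmetic is
much simpler.) If we take the inner product of each side of (*) with respect to $u_i$, we get

$\langle v, u_i \rangle = \langle x_1 u_1 + x_2 u_2 + x_3 u_3, u_i \rangle$ or $\langle v, u_i \rangle = x_i \langle u_i, u_i \rangle$ or $x_i = \frac{\langle v, u_i \rangle}{\langle u_i, u_i \rangle}$

Here two terms drop out, because $u_1, u_2, u_3$ are orthogonal. Accordingly,

$x_1 = \frac{\langle v, u_1 \rangle}{\langle u_1, u_1 \rangle} = \frac{7 + 2 + 9}{1 + 4 + 1} = \frac{18}{6} = 3$, $x_2 = \frac{\langle v, u_2 \rangle}{\langle u_2, u_2 \rangle} = \frac{14 + 1 - 36}{4 + 1 + 16} = \frac{-21}{21} = -1$

$x_3 = \frac{\langle v, u_3 \rangle}{\langle u_3, u_3 \rangle} = \frac{21 - 2 + 9}{9 + 4 + 1} = \frac{28}{14} = 2$

Thus, again, we get $v = 3u_1 - u_2 + 2u_3$.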
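
Method 2 translates directly into a few lines of code. The sketch below is illustrative and not from the original text; it assumes integer components so that exact fractions can be used, and the name `fourier_coefficients` is a hypothetical helper.

```python
from fractions import Fraction

def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def fourier_coefficients(v, orthogonal_basis):
    """Coefficients x_i = <v, u_i> / <u_i, u_i>; valid only for an orthogonal basis."""
    return [Fraction(dot(v, u), dot(u, u)) for u in orthogonal_basis]

u1, u2, u3 = (1, 2, 1), (2, 1, -4), (3, -2, 1)
v = (7, 1, 9)

coeffs = fourier_coefficients(v, [u1, u2, u3])

# Rebuild v from the coefficients to confirm the linear combination.
rebuilt = tuple(sum(c * u[k] for c, u in zip(coeffs, (u1, u2, u3)))
                for k in range(3))
```

No linear system is solved; each coefficient comes from two inner products, which is the simplification the orthogonality of the basis provides.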


The procedure in Method 2 holds in general. Namely, we have the following theorem (proved in
Problem 7.17).


THEOREM 7.7: Let $\{u_1, u_2, \ldots, u_n\}$ be an orthogonal basis of V. Then, for any $v \in V$,

$v = \frac{\langle v, u_1 \rangle}{\langle u_1, u_1 \rangle} u_1 + \frac{\langle v, u_2 \rangle}{\langle u_2, u_2 \rangle} u_2 + \cdots + \frac{\langle v, u_n \rangle}{\langle u_n, u_n \rangle} u_n$

Remark: The scalar $k_i = \frac{\langle v, u_i \rangle}{\langle u_i, u_i \rangle}$ is called the Fourier coefficient of v with respect to $u_i$, because it
is analogous to a coefficient in the Fourier series of a function. This scalar also has a geometric
interpretation, which is discussed below.

Projections 


Let V be an inner product space. Suppose w is a given nonzero vector in V, and suppose v is another
vector. We seek the "projection of v along w," which, as indicated in Fig. 7-3(a), will be the multiple cw
of w such that v' = v - cw is orthogonal to w. This means

$\langle v - cw, w \rangle = 0$ or $\langle v, w \rangle - c \langle w, w \rangle = 0$ or $c = \frac{\langle v, w \rangle}{\langle w, w \rangle}$

[Figure 7-3: panels (a) and (b); panel (b) shows proj(v, W) and v - proj(v, W)]

Accordingly, the projection of v along w is denoted and defined by

$\mathrm{proj}(v, w) = cw = \frac{\langle v, w \rangle}{\langle w, w \rangle} w$

Such a scalar c is unique, and it is called the Fourier coefficient of v with respect to w or the component of
v along w.

The above notion is generalized as follows (see Problem 7.25).


THEOREM 7.8: Suppose $w_1, w_2, \ldots, w_r$ form an orthogonal set of nonzero vectors in V. Let v be any
vector in V. Define

$v' = v - (c_1 w_1 + c_2 w_2 + \cdots + c_r w_r)$

where

$c_1 = \frac{\langle v, w_1 \rangle}{\langle w_1, w_1 \rangle}$, $c_2 = \frac{\langle v, w_2 \rangle}{\langle w_2, w_2 \rangle}$, ..., $c_r = \frac{\langle v, w_r \rangle}{\langle w_r, w_r \rangle}$

Then v' is orthogonal to $w_1, w_2, \ldots, w_r$.


Note that each $c_i$ in the above theorem is the component (Fourier coefficient) of v along the given $w_i$.


Remark: The notion of the projection of a vector $v \in V$ along a subspace W of V is defined as
follows. By Theorem 7.4, $V = W \oplus W^\perp$. Hence, v may be expressed uniquely in the form

$v = w + w'$, where $w \in W$ and $w' \in W^\perp$

We define w to be the projection of v along W, and denote it by proj(v, W), as pictured in Fig. 7-3(b). In
particular, if $W = \mathrm{span}(w_1, w_2, \ldots, w_r)$, where the $w_i$ form an orthogonal set, then

$\mathrm{proj}(v, W) = c_1 w_1 + c_2 w_2 + \cdots + c_r w_r$

Here $c_i$ is the component of v along $w_i$, as above.
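
The projection onto the span of an orthogonal set can be sketched as follows. This is an illustration under stated assumptions, not the text's own procedure: the vectors reuse $u_1$, $u_2$, and v from the Fourier-coefficient discussion above, integer components are assumed so that exact fractions apply, and `project` is a hypothetical name.

```python
from fractions import Fraction

def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def project(v, orthogonal_set):
    """proj(v, W) for W = span(orthogonal_set); the set must be orthogonal and nonzero."""
    proj = [Fraction(0)] * len(v)
    for w in orthogonal_set:
        c = Fraction(dot(v, w), dot(w, w))        # component of v along w
        proj = [p + c * wi for p, wi in zip(proj, w)]
    return proj

w1, w2 = (1, 2, 1), (2, 1, -4)     # an orthogonal set in R^3
v = (7, 1, 9)

p = project(v, [w1, w2])                        # proj(v, W)
residual = [a - b for a, b in zip(v, p)]        # v' = v - proj(v, W)
```

As Theorem 7.8 asserts, the residual v' is orthogonal to each vector of the orthogonal set.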


7.7 Gram-Schmidt Orthogonalization Process 


Suppose $\{v_1, v_2, \ldots, v_n\}$ is a basis of an inner product space V. One can use this basis to construct an
orthogonal basis $\{w_1, w_2, \ldots, w_n\}$ of V as follows. Set

$w_1 = v_1$
$w_2 = v_2 - \frac{\langle v_2, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1$
$w_3 = v_3 - \frac{\langle v_3, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 - \frac{\langle v_3, w_2 \rangle}{\langle w_2, w_2 \rangle} w_2$
$\cdots$
$w_n = v_n - \frac{\langle v_n, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 - \frac{\langle v_n, w_2 \rangle}{\langle w_2, w_2 \rangle} w_2 - \cdots - \frac{\langle v_n, w_{n-1} \rangle}{\langle w_{n-1}, w_{n-1} \rangle} w_{n-1}$

In other words, for k = 2, 3, ..., n, we define

$w_k = v_k - c_{k1} w_1 - c_{k2} w_2 - \cdots - c_{k,k-1} w_{k-1}$

where $c_{ki} = \langle v_k, w_i \rangle / \langle w_i, w_i \rangle$ is the component of $v_k$ along $w_i$. By Theorem 7.8, each $w_k$ is orthogonal to
the preceding w's. Thus, $w_1, w_2, \ldots, w_n$ form an orthogonal basis for V as claimed. Normalizing each $w_i$
will then yield an orthonormal basis for V.

The above construction is known as the Gram-Schmidt orthogonalization process. The following
remarks are in order.


Remark 1: Each vector $w_k$ is a linear combination of $v_k$ and the preceding w's. Hence, one can
easily show, by induction, that each $w_k$ is a linear combination of $v_1, v_2, \ldots, v_k$.

Remark 2: Because taking multiples of vectors does not affect orthogonality, it may be simpler in
hand calculations to clear fractions in any new $w_k$, by multiplying $w_k$ by an appropriate scalar, before
obtaining the next $w_{k+1}$.

Remark 3: Suppose $u_1, u_2, \ldots, u_r$ are linearly independent, and so they form a basis for
$U = \mathrm{span}(u_i)$. Applying the Gram-Schmidt orthogonalization process to the u's yields an orthogonal
basis for U.
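
The construction above can be sketched in code for vectors in $R^n$ with the usual inner product. This is a minimal illustration, not part of the original text: it assumes the input vectors are linearly independent (otherwise some $w_k$ is zero and the division fails), uses exact fractions instead of clearing denominators as in Remark 2, and the name `gram_schmidt` is an assumption. The sample input is the set of vectors used in Example 7.10.

```python
from fractions import Fraction

def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Return an orthogonal basis w_1, ..., w_n for span(vectors).

    Implements w_k = v_k - sum_i c_ki w_i with c_ki = <v_k, w_i> / <w_i, w_i>.
    The input vectors are assumed to be linearly independent.
    """
    basis = []
    for v in vectors:
        w = [Fraction(x) for x in v]
        for b in basis:
            c = dot(v, b) / dot(b, b)             # component of v_k along w_i
            w = [wi - c * bi for wi, bi in zip(w, b)]
        basis.append(w)
    return basis

basis = gram_schmidt([(1, 1, 1, 1), (1, 2, 4, 5), (1, -3, -4, -2)])
w1, w2, w3 = basis
w3_cleared = [10 * x for x in w3]   # clear fractions, as suggested in Remark 2
```

Dividing each resulting vector by its length then gives an orthonormal basis.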
The following theorems (proved in Problems 7.26 and 7.27) use the above algorithm and remarks.


THEOREM 7.9: Let $\{v_1, v_2, \ldots, v_n\}$ be any basis of an inner product space V. Then there exists an
orthonormal basis $\{u_1, u_2, \ldots, u_n\}$ of V such that the change-of-basis matrix from
$\{v_i\}$ to $\{u_i\}$ is triangular; that is, for k = 1, ..., n,

$u_k = a_{k1} v_1 + a_{k2} v_2 + \cdots + a_{kk} v_k$


THEOREM 7.10: Suppose $S = \{w_1, w_2, \ldots, w_r\}$ is an orthogonal basis for a subspace W of a vector
space V. Then one may extend S to an orthogonal basis for V; that is, one may find
vectors $w_{r+1}, \ldots, w_n$ such that $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for V.


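EXAMPLE 7.10 Apply the Gram-Schmidt orthogonalization process to find an orthogonal basis and
then an orthonormal basis for the subspace U of $R^4$ spanned by

$v_1 = (1, 1, 1, 1)$, $v_2 = (1, 2, 4, 5)$, $v_3 = (1, -3, -4, -2)$

(1) First set $w_1 = v_1 = (1, 1, 1, 1)$.
(2) Compute

$v_2 - \frac{\langle v_2, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 = v_2 - \frac{12}{4} w_1 = (-2, -1, 1, 2)$

Set $w_2 = (-2, -1, 1, 2)$.
(3) Compute

$v_3 - \frac{\langle v_3, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 - \frac{\langle v_3, w_2 \rangle}{\langle w_2, w_2 \rangle} w_2 = v_3 - \frac{(-8)}{4} w_1 - \frac{(-7)}{10} w_2 = \left( \tfrac{8}{5}, -\tfrac{17}{10}, -\tfrac{13}{10}, \tfrac{7}{5} \right)$

Clear fractions to obtain $w_3 = (16, -17, -13, 14)$.
Thus, $w_1, w_2, w_3$ form an orthogonal basis for U. Normalize these vectors to obtain an orthonormal basis
$\{u_1, u_2, u_3\}$ of U. We have $\|w_1\|^2 = 4$, $\|w_2\|^2 = 10$, $\|w_3\|^2 = 910$, so

$u_1 = \tfrac{1}{2}(1, 1, 1, 1)$, $u_2 = \tfrac{1}{\sqrt{10}}(-2, -1, 1, 2)$, $u_3 = \tfrac{1}{\sqrt{910}}(16, -17, -13, 14)$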

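EXAMPLE 7.11 Let V be the vector space of polynomials f(t) with inner product
$\langle f, g \rangle = \int_{-1}^{1} f(t) g(t)\, dt$. Apply the Gram-Schmidt orthogonalization process to $\{1, t, t^2, t^3\}$ to find an
orthogonal basis $\{f_0, f_1, f_2, f_3\}$ with integer coefficients for $P_3(t)$.

Here we use the fact that, for r + s = n,

$\langle t^r, t^s \rangle = \int_{-1}^{1} t^n\, dt = \left[ \frac{t^{n+1}}{n+1} \right]_{-1}^{1} = \begin{cases} 2/(n+1) & \text{when n is even} \\ 0 & \text{when n is odd} \end{cases}$

(1) First set $f_0 = 1$.

(2) Compute $t - \frac{\langle t, 1 \rangle}{\langle 1, 1 \rangle}(1) = t - 0 = t$. Set $f_1 = t$.

(3) Compute

$t^2 - \frac{\langle t^2, 1 \rangle}{\langle 1, 1 \rangle}(1) - \frac{\langle t^2, t \rangle}{\langle t, t \rangle}(t) = t^2 - \frac{2/3}{2}(1) + 0(t) = t^2 - \tfrac{1}{3}$

Multiply by 3 to obtain $f_2 = 3t^2 - 1$.

(4) Compute

$t^3 - \frac{\langle t^3, 1 \rangle}{\langle 1, 1 \rangle}(1) - \frac{\langle t^3, t \rangle}{\langle t, t \rangle}(t) - \frac{\langle t^3, 3t^2 - 1 \rangle}{\langle 3t^2 - 1, 3t^2 - 1 \rangle}(3t^2 - 1) = t^3 - 0(1) - \frac{2/5}{2/3}(t) - 0(3t^2 - 1) = t^3 - \tfrac{3}{5} t$

Multiply by 5 to obtain $f_3 = 5t^3 - 3t$.
Thus, $\{1, t, 3t^2 - 1, 5t^3 - 3t\}$ is the required orthogonal basis.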


Remark: Normalizing the polynomials in Example 7.11 so that p(1) = 1 yields the polynomials

$1$, $t$, $\tfrac{1}{2}(3t^2 - 1)$, $\tfrac{1}{2}(5t^3 - 3t)$

These are the first four Legendre polynomials, which appear in the study of differential equations.
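
The Gram-Schmidt computation of Example 7.11 and the rescaling in the remark can be reproduced with exact arithmetic. The following sketch is illustrative only: it assumes polynomials are stored as coefficient lists, lowest degree first, and it uses the moment formula $\langle t^r, t^s \rangle = 2/(n+1)$ for even n = r + s (and 0 for odd n) from the example. The names `inner`, `gram_schmidt_polys`, and `legendre` are hypothetical.

```python
from fractions import Fraction

def inner(p, q):
    """Integral of p(t) q(t) over [-1, 1] for coefficient lists (lowest degree first)."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            n = i + j
            if n % 2 == 0:                       # odd powers integrate to 0
                total += Fraction(a) * Fraction(b) * Fraction(2, n + 1)
    return total

def gram_schmidt_polys(polys):
    """Gram-Schmidt on polynomials listed in order of increasing degree."""
    basis = []
    for p in polys:
        w = [Fraction(c) for c in p]
        for b in basis:
            c = inner(p, b) / inner(b, b)
            padded = b + [Fraction(0)] * (len(w) - len(b))
            w = [wi - c * bi for wi, bi in zip(w, padded)]
        basis.append(w)
    return basis

monomials = [[1], [0, 1], [0, 0, 1], [0, 0, 0, 1]]   # 1, t, t^2, t^3
orthogonal = gram_schmidt_polys(monomials)

# p(1) equals the sum of the coefficients, so dividing by it forces p(1) = 1.
legendre = [[c / sum(p) for c in p] for p in orthogonal]
```

Here `orthogonal` holds the unscaled results $1$, $t$, $t^2 - \tfrac{1}{3}$, $t^3 - \tfrac{3}{5}t$, and `legendre` holds the rescaled polynomials of the remark.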


7.8 Orthogonal and Positive Definite Matrices 


This section discusses two types of matrices that are closely related to real inner product spaces V. Here
vectors in $R^n$ will be represented by column vectors. Thus, $\langle u, v \rangle = u^T v$ denotes the inner product in
Euclidean space $R^n$.


Orthogonal Matrices


A real matrix P is orthogonal if P is nonsingular and $P^{-1} = P^T$, or, in other words, if $P P^T = P^T P = I$.
First we recall (Theorem 2.6) an important characterization of such matrices.


THEOREM 7.11: Let P be a real matrix. Then the following are equivalent: (a) P is orthogonal; (b)
the rows of P form an orthonormal set; (c) the columns of P form an orthonormal
set.

(This theorem is true only using the usual inner product on $R^n$. It is not true if $R^n$ is given any other
inner product.)


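EXAMPLE 7.12

(a) Let $P = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ 2/\sqrt{6} & -1/\sqrt{6} & -1/\sqrt{6} \end{bmatrix}$. The rows of P are orthogonal to each other and are unit vectors. Thus
P is an orthogonal matrix.

(b) Let P be a 2 x 2 orthogonal matrix. Then, for some real number $\theta$, we have

$P = \begin{bmatrix} \cos \theta & \sin \theta \\ -\sin \theta & \cos \theta \end{bmatrix}$ or $P = \begin{bmatrix} \cos \theta & \sin \theta \\ \sin \theta & -\cos \theta \end{bmatrix}$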

The following two theorems (proved in Problems 7.37 and 7.38) show important relationships 
between orthogonal matrices and orthonormal bases of a real inner product space V. 


THEOREM 7.12: Suppose $E = \{e_i\}$ and $E' = \{e_i'\}$ are orthonormal bases of V. Let P be the change-
of-basis matrix from the basis E to the basis E'. Then P is orthogonal.


THEOREM 7.13: Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of an inner product space V. Let $P = [a_{ij}]$
be an orthogonal matrix. Then the following n vectors form an orthonormal basis
for V:

$e_i' = a_{1i} e_1 + a_{2i} e_2 + \cdots + a_{ni} e_n$, i = 1, 2, ..., n


Positive Definite Matrices


Let A be a real symmetric matrix; that is, $A^T = A$. Then A is said to be positive definite if, for every
nonzero vector u in $R^n$,

$\langle u, Au \rangle = u^T A u > 0$

Algorithms to decide whether or not a matrix A is positive definite will be given in Chapter 12. However,
for 2 x 2 matrices, we have simple criteria that we state formally in the following theorem (proved in
Problem 7.43).


THEOREM 7.14: A 2 x 2 real symmetric matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$ is positive definite
if and only if the diagonal entries a and d are positive and the determinant
$|A| = ad - bc = ad - b^2$ is positive.


EXAMPLE 7.13 Consider the following symmetric matrices:

$A = \begin{bmatrix} 1 & 3 \\ 3 & 4 \end{bmatrix}$, $B = \begin{bmatrix} 1 & -2 \\ -2 & -3 \end{bmatrix}$, $C = \begin{bmatrix} 1 & -2 \\ -2 & 5 \end{bmatrix}$

A is not positive definite, because $|A| = 4 - 9 = -5$ is negative. B is not positive definite, because the diagonal
entry -3 is negative. However, C is positive definite, because the diagonal entries 1 and 5 are positive, and the
determinant $|C| = 5 - 4 = 1$ is also positive.
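
The criterion of Theorem 7.14 is easy to express in code. The sketch below is illustrative and not from the original text; the function name is an assumption, and the sample entries are chosen to be consistent with the diagonal entries and determinants quoted in Example 7.13.

```python
def is_positive_definite_2x2(a, b, d):
    """Test the symmetric matrix [[a, b], [b, d]] using Theorem 7.14:
    positive diagonal entries and positive determinant ad - b^2."""
    return a > 0 and d > 0 and a * d - b * b > 0

def quadratic_form(a, b, d, x, y):
    """u^T A u for u = (x, y) and A = [[a, b], [b, d]]."""
    return a * x * x + 2 * b * x * y + d * y * y

result_A = is_positive_definite_2x2(1, 3, 4)     # determinant 4 - 9 = -5
result_B = is_positive_definite_2x2(1, -2, -3)   # diagonal entry -3
result_C = is_positive_definite_2x2(1, -2, 5)    # determinant 5 - 4 = 1

# For A, a specific nonzero vector with u^T A u <= 0 witnesses the failure.
witness_A = quadratic_form(1, 3, 4, 3, -1)
```

The witness value shows directly that the defining inequality $u^T A u > 0$ fails for A at u = (3, -1).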

The following theorem (proved in Problem 7.44) holds. 


THEOREM 7.15: Let A be a real positive definite matrix. Then the function $\langle u, v \rangle = u^T A v$ is an inner
product on $R^n$.


Matrix Representation of an Inner Product (Optional) 


Theorem 7.15 says that every positive definite matrix A determines an inner product on $R^n$. This
subsection may be viewed as giving the converse of this result.
Let V be a real inner product space with basis $S = \{u_1, u_2, \ldots, u_n\}$. The matrix

$A = [a_{ij}]$, where $a_{ij} = \langle u_i, u_j \rangle$

is called the matrix representation of the inner product on V relative to the basis S.

Observe that A is symmetric, because the inner product is symmetric; that is, $\langle u_i, u_j \rangle = \langle u_j, u_i \rangle$. Also, A
depends on both the inner product on V and the basis S for V. Moreover, if S is an orthogonal basis, then
A is diagonal, and if S is an orthonormal basis, then A is the identity matrix.


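EXAMPLE 7.14 The vectors $u_1 = (1, 1, 0)$, $u_2 = (1, 2, 3)$, $u_3 = (1, 3, 5)$ form a basis S for Euclidean
space $R^3$. Find the matrix A that represents the inner product in $R^3$ relative to this basis S.

First compute each $\langle u_i, u_j \rangle$ to obtain

$\langle u_1, u_1 \rangle = 1 + 1 + 0 = 2$, $\langle u_1, u_2 \rangle = 1 + 2 + 0 = 3$, $\langle u_1, u_3 \rangle = 1 + 3 + 0 = 4$
$\langle u_2, u_2 \rangle = 1 + 4 + 9 = 14$, $\langle u_2, u_3 \rangle = 1 + 6 + 15 = 22$, $\langle u_3, u_3 \rangle = 1 + 9 + 25 = 35$

Then $A = \begin{bmatrix} 2 & 3 & 4 \\ 3 & 14 & 22 \\ 4 & 22 & 35 \end{bmatrix}$. As expected, A is symmetric.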
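
The matrix of Example 7.14 can be generated by computing every pairwise inner product of the basis vectors. This is a minimal sketch, not part of the original text; the name `gram_matrix` is an assumption (such a matrix of inner products is commonly called a Gram matrix).

```python
def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_matrix(basis):
    """Matrix A = [a_ij] with a_ij = <u_i, u_j> for the given basis vectors."""
    return [[dot(ui, uj) for uj in basis] for ui in basis]

S = [(1, 1, 0), (1, 2, 3), (1, 3, 5)]
A = gram_matrix(S)

# Symmetry of A follows from the symmetry of the inner product.
is_symmetric = all(A[i][j] == A[j][i] for i in range(3) for j in range(3))
```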


The following theorems (proved in Problems 7.45 and 7.46, respectively) hold. 


THEOREM 7.16: Let A be the matrix representation of an inner product relative to basis S for V.
Then, for any vectors $u, v \in V$, we have

$\langle u, v \rangle = [u]^T A [v]$

where [u] and [v] denote the (column) coordinate vectors relative to the basis S.

THEOREM 7.17: Let A be the matrix representation of any inner product on V. Then A is a positive
definite matrix.


7.9 Complex Inner Product Spaces 


This section considers vector spaces over the complex field C. First we recall some properties of the
complex numbers (Section 1.7), especially the relations between a complex number z = a + bi, where
$a, b \in R$, and its complex conjugate $\bar{z} = a - bi$:

$z\bar{z} = a^2 + b^2$, $|z| = \sqrt{a^2 + b^2}$, $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$, $\overline{z_1 z_2} = \bar{z}_1 \bar{z}_2$, $\bar{\bar{z}} = z$

Also, z is real if and only if $\bar{z} = z$.


The following definition applies. 


DEFINITION: Let V be a vector space over C. Suppose to each pair of vectors, $u, v \in V$ there is
assigned a complex number, denoted by $\langle u, v \rangle$. This function is called a (complex) inner
product on V if it satisfies the following axioms:

$[I_1^*]$ (Linear Property) $\langle au_1 + bu_2, v \rangle = a \langle u_1, v \rangle + b \langle u_2, v \rangle$
$[I_2^*]$ (Conjugate Symmetric Property) $\langle u, v \rangle = \overline{\langle v, u \rangle}$
$[I_3^*]$ (Positive Definite Property) $\langle u, u \rangle \ge 0$; and $\langle u, u \rangle = 0$ if and only if u = 0.


The vector space V over C with an inner product is called a (complex) inner product space.
Observe that a complex inner product differs from the real case only in the second axiom $[I_2^*]$.
Axiom $[I_1^*]$ (Linear Property) is equivalent to the two conditions:

(a) $\langle u_1 + u_2, v \rangle = \langle u_1, v \rangle + \langle u_2, v \rangle$, (b) $\langle ku, v \rangle = k \langle u, v \rangle$

On the other hand, applying $[I_1^*]$ and $[I_2^*]$, we obtain

$\langle u, kv \rangle = \overline{\langle kv, u \rangle} = \overline{k \langle v, u \rangle} = \bar{k}\, \overline{\langle v, u \rangle} = \bar{k} \langle u, v \rangle$

That is, we must take the conjugate of a complex number when it is taken out of the second position of a
complex inner product. In fact (Problem 7.47), the inner product is conjugate linear in the second
position; that is,

$\langle u, av_1 + bv_2 \rangle = \bar{a} \langle u, v_1 \rangle + \bar{b} \langle u, v_2 \rangle$

Combining linear in the first position and conjugate linear in the second position, we obtain, by induction,

$\left\langle \sum_i a_i u_i, \sum_j b_j v_j \right\rangle = \sum_{i,j} a_i \bar{b}_j \langle u_i, v_j \rangle$

The following remarks are in order.

Remark 1: Axiom $[I_1^*]$ by itself implies that $\langle 0, 0 \rangle = \langle 0v, 0 \rangle = 0 \langle v, 0 \rangle = 0$. Accordingly, $[I_1^*]$, $[I_2^*]$,
and $[I_3^*]$ are equivalent to $[I_1^*]$, $[I_2^*]$, and the following axiom:

$[I_3^{*\prime}]$ If $u \ne 0$, then $\langle u, u \rangle > 0$.

That is, a function satisfying $[I_1^*]$, $[I_2^*]$, and $[I_3^{*\prime}]$ is a (complex) inner product on V.


Remark 2: By $[I_2^*]$, $\langle u, u \rangle = \overline{\langle u, u \rangle}$. Thus, $\langle u, u \rangle$ must be real. By $[I_3^*]$, $\langle u, u \rangle$ must be nonnegative,
and hence, its positive real square root exists. As with real inner product spaces, we define $\|u\| = \sqrt{\langle u, u \rangle}$
to be the norm or length of u.


Remark 3: In addition to the norm, we define the notions of orthogonality, orthogonal complement,
and orthogonal and orthonormal sets as before. In fact, the definitions of distance and Fourier
coefficient and projections are the same as in the real case.


EXAMPLE 7.15 (Complex Euclidean Space $C^n$). Let $V = C^n$, and let $u = (z_i)$ and $v = (w_i)$ be vectors in
$C^n$. Then

$\langle u, v \rangle = \sum_{k} z_k \bar{w}_k = z_1 \bar{w}_1 + z_2 \bar{w}_2 + \cdots + z_n \bar{w}_n$

is an inner product on V, called the usual or standard inner product on $C^n$. V with this inner product is called
Complex Euclidean Space. We assume this inner product on $C^n$ unless otherwise stated or implied. Assuming u and
v are column vectors, the above inner product may be defined by

$\langle u, v \rangle = u^T \bar{v}$

where, as with matrices, $\bar{v}$ means the conjugate of each element of v. If u and v are real, we have $\bar{w}_i = w_i$. In this
case, the inner product reduces to the analogous one on $R^n$.
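
Python's built-in complex numbers make the usual inner product on $C^n$ easy to experiment with. The following is a hedged sketch, not part of the original text: the function name `cdot` and the sample vectors are illustrative assumptions, chosen with small integer parts so that the floating-point results are exact.

```python
def cdot(u, v):
    """Usual inner product on C^n: sum of z_k times the conjugate of w_k."""
    return sum(z * w.conjugate() for z, w in zip(u, v))

u = (1 + 1j, 2 + 0j)
v = (0 + 1j, 3 - 1j)
k = 2 - 1j

uv = cdot(u, v)                       # <u, v>
vu = cdot(v, u)                       # <v, u>, the conjugate of <u, v>
uu = cdot(u, u)                       # <u, u>, real and nonnegative

# Conjugate linearity in the second position: <u, kv> = conj(k) <u, v>.
scaled_v = tuple(k * w for w in v)
second_position = cdot(u, scaled_v)
```

The example illustrates the three points made in the text: conjugate symmetry, a real nonnegative value for $\langle u, u \rangle$, and the conjugate that appears when a scalar is taken out of the second position.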


EXAMPLE 7.16

(a) Let V be the vector space of complex continuous functions on the (real) interval $a \le t \le b$. Then the following
is the usual inner product on V:

$\langle f, g \rangle = \int_a^b f(t) \overline{g(t)}\, dt$

(b) Let U be the vector space of m x n matrices over C. Suppose $A = (z_{ij})$ and $B = (w_{ij})$ are elements of U. Then
the following is the usual inner product on U:

$\langle A, B \rangle = \mathrm{tr}(B^H A) = \sum_{i=1}^{m} \sum_{j=1}^{n} \bar{w}_{ij} z_{ij}$

As usual, $B^H = \bar{B}^T$; that is, $B^H$ is the conjugate transpose of B.

The following is a list of theorems for complex inner product spaces that are analogous to those for
the real case. Here a Hermitian matrix A (i.e., one where $A^H = \bar{A}^T = A$) plays the same role that a
symmetric matrix A (i.e., one where $A^T = A$) plays in the real case. (Theorem 7.18 is proved in
Problem 7.50.)

THEOREM 7.18: (Cauchy-Schwarz) Let V be a complex inner product space. Then

$|\langle u, v \rangle| \le \|u\| \|v\|$

THEOREM 7.19: Let W be a subspace of a complex inner product space V. Then $V = W \oplus W^\perp$.


THEOREM 7.20: Suppose $\{u_1, u_2, \ldots, u_n\}$ is an orthogonal basis for a complex inner product space V. Then, for
any $v \in V$,

$v = \frac{\langle v, u_1 \rangle}{\langle u_1, u_1 \rangle} u_1 + \frac{\langle v, u_2 \rangle}{\langle u_2, u_2 \rangle} u_2 + \cdots + \frac{\langle v, u_n \rangle}{\langle u_n, u_n \rangle} u_n$


THEOREM 7.21: Suppose $\{u_1, u_2, \ldots, u_n\}$ is a basis for a complex inner product space V. Let
$A = [a_{ij}]$ be the complex matrix defined by $a_{ij} = \langle u_i, u_j \rangle$. Then, for any $u, v \in V$,

$\langle u, v \rangle = [u]^T A \overline{[v]}$

where [u] and [v] are the coordinate column vectors in the given basis $\{u_i\}$.
(Remark: This matrix A is said to represent the inner product on V.)


THEOREM 7.22: Let A be a Hermitian matrix (i.e., $A^H = \bar{A}^T = A$) such that $X^T A \bar{X}$ is real and
positive for every nonzero vector $X \in C^n$. Then $\langle u, v \rangle = u^T A \bar{v}$ is an inner product
on $C^n$.

THEOREM 7.23: Let A be the matrix that represents an inner product on V. Then A is Hermitian, and
$X^T A \bar{X}$ is real and positive for any nonzero vector X in $C^n$.


7.10 Normed Vector Spaces (Optional)


We begin with a definition.


DEFINITION: Let V be a real or complex vector space. Suppose to each $v \in V$ there is assigned a real
number, denoted by $\|v\|$. This function $\|\cdot\|$ is called a norm on V if it satisfies the
following axioms:

$[N_1]$ $\|v\| \ge 0$; and $\|v\| = 0$ if and only if v = 0.
$[N_2]$ $\|kv\| = |k| \|v\|$.
$[N_3]$ $\|u + v\| \le \|u\| + \|v\|$.


A vector space V with a norm is called a normed vector space.
Suppose V is a normed vector space. The distance between two vectors u and v in V is denoted and
defined by

$d(u, v) = \|u - v\|$

The following theorem (proved in Problem 7.56) is the main reason why d(u, v) is called the distance
between u and v.


THEOREM 7.24: Let V be a normed vector space. Then the function $d(u, v) = \|u - v\|$ satisfies the
following three axioms of a metric space:

$[M_1]$ $d(u, v) \ge 0$; and d(u, v) = 0 if and only if u = v.
$[M_2]$ d(u, v) = d(v, u).
$[M_3]$ $d(u, v) \le d(u, w) + d(w, v)$.

Normed Vector Spaces and Inner Product Spaces 


Suppose V is an inner product space. Recall that the norm of a vector v in V is defined by

$\|v\| = \sqrt{\langle v, v \rangle}$

One can prove (Theorem 7.2) that this norm satisfies $[N_1]$, $[N_2]$, and $[N_3]$. Thus, every inner product space
V is a normed vector space. On the other hand, there may be norms on a vector space V that do not come
from an inner product on V, as shown below.


Norms on $R^n$ and $C^n$


The following define three important norms on $R^n$ and $C^n$:

$\|(a_1, \ldots, a_n)\|_\infty = \max(|a_i|)$
$\|(a_1, \ldots, a_n)\|_1 = |a_1| + |a_2| + \cdots + |a_n|$
$\|(a_1, \ldots, a_n)\|_2 = \sqrt{|a_1|^2 + |a_2|^2 + \cdots + |a_n|^2}$

(Note that subscripts are used to distinguish between the three norms.) The norms $\|\cdot\|_\infty$, $\|\cdot\|_1$, and $\|\cdot\|_2$
are called the infinity-norm, one-norm, and two-norm, respectively. Observe that $\|\cdot\|_2$ is the norm on $R^n$
(respectively, $C^n$) induced by the usual inner product on $R^n$ (respectively, $C^n$). We will let $d_\infty$, $d_1$, $d_2$
denote the corresponding distance functions.


EXAMPLE 7.17 Consider vectors u = (1, -5, 3) and v = (4, 2, -3) in $R^3$.
(a) The infinity norm chooses the maximum of the absolute values of the components. Hence,

$\|u\|_\infty = 5$ and $\|v\|_\infty = 4$

(b) The one-norm adds the absolute values of the components. Thus,

$\|u\|_1 = 1 + 5 + 3 = 9$ and $\|v\|_1 = 4 + 2 + 3 = 9$

(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by
the usual inner product on $R^3$). Thus,

$\|u\|_2 = \sqrt{1 + 25 + 9} = \sqrt{35}$ and $\|v\|_2 = \sqrt{16 + 4 + 9} = \sqrt{29}$

(d) Because u - v = (1 - 4, -5 - 2, 3 + 3) = (-3, -7, 6), we have

$d_\infty(u, v) = 7$, $d_1(u, v) = 3 + 7 + 6 = 16$, $d_2(u, v) = \sqrt{9 + 49 + 36} = \sqrt{94}$
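
The three norms and their distance functions can be written in a few lines. This sketch is illustrative and not part of the original text; the function names are assumptions, and `abs` is used so that the same code also applies to complex components.

```python
import math

def norm_inf(x):
    """Infinity-norm: maximum absolute value of the components."""
    return max(abs(a) for a in x)

def norm_one(x):
    """One-norm: sum of the absolute values of the components."""
    return sum(abs(a) for a in x)

def norm_two(x):
    """Two-norm: square root of the sum of the squared absolute values."""
    return math.sqrt(sum(abs(a) ** 2 for a in x))

def distance(norm, u, v):
    """Distance d(u, v) = ||u - v|| for the given norm function."""
    return norm([a - b for a, b in zip(u, v)])

u = (1, -5, 3)
v = (4, 2, -3)

d_inf = distance(norm_inf, u, v)
d_one = distance(norm_one, u, v)
d_two = distance(norm_two, u, v)
```

Passing the norm as a function argument mirrors the text's point that each norm induces its own distance function.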


EXAMPLE 7.18 Consider the Cartesian plane $R^2$ shown in Fig. 7-4.

(a) Let $D_1$ be the set of points u = (x, y) in $R^2$ such that $\|u\|_2 = 1$. Then $D_1$ consists of the points (x, y) such that
$\|u\|_2^2 = x^2 + y^2 = 1$. Thus, $D_1$ is the unit circle, as shown in Fig. 7-4.

[Figure 7-4]

(b) Let $D_2$ be the set of points u = (x, y) in $R^2$ such that $\|u\|_1 = 1$. Then $D_2$ consists of the points (x, y) such that
$\|u\|_1 = |x| + |y| = 1$. Thus, $D_2$ is the diamond inside the unit circle, as shown in Fig. 7-4.

(c) Let $D_3$ be the set of points u = (x, y) in $R^2$ such that $\|u\|_\infty = 1$. Then $D_3$ consists of the points (x, y) such that
$\|u\|_\infty = \max(|x|, |y|) = 1$. Thus, $D_3$ is the square circumscribing the unit circle, as shown in Fig. 7-4.


Norms on $C[a, b]$

Consider the vector space $V = C[a, b]$ of real continuous functions on the interval $a \le t \le b$. Recall that the following defines an inner product on $V$:

$\langle f, g \rangle = \int_a^b f(t)\,g(t)\,dt$

Accordingly, the above inner product defines the following norm on $V = C[a, b]$ (which is analogous to the $\|\cdot\|_2$ norm on $\mathbf{R}^n$):

$\|f\|_2 = \sqrt{\int_a^b [f(t)]^2\,dt}$

The following define the other norms on $V = C[a, b]$:

$\|f\|_1 = \int_a^b |f(t)|\,dt$ and $\|f\|_\infty = \max(|f(t)|)$

There are geometrical descriptions of these two norms and their corresponding distance functions, which are described below.

The first norm is pictured in Fig. 7-5. Here

$\|f\|_1$ = area between the function $|f|$ and the $t$-axis
$d_1(f, g)$ = area between the functions $f$ and $g$

[Figure 7-5: (a) $\|f\|_1$ is shaded. (b) $d_1(f, g)$ is shaded.]

This norm is analogous to the norm $\|\cdot\|_1$ on $\mathbf{R}^n$.


The second norm is pictured in Fig. 7-6. Here

$\|f\|_\infty$ = maximum distance between $f$ and the $t$-axis
$d_\infty(f, g)$ = maximum distance between $f$ and $g$

This norm is analogous to the norm $\|\cdot\|_\infty$ on $\mathbf{R}^n$.

[Figure 7-6: (a) $\|f\|_\infty$. (b) $d_\infty(f, g)$.]


SOLVED PROBLEMS 


Inner Products 
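7.1. Expand:

(a) $\langle 5u_1 + 8u_2,\ 6v_1 - 7v_2 \rangle$,
(b) $\langle 3u + 5v,\ 4u - 6v \rangle$,
(c) $\|2u - 3v\|^2$

Use linearity in both positions and, when possible, symmetry, $\langle u, v \rangle = \langle v, u \rangle$.

(a) Take the inner product of each term on the left with each term on the right:

$\langle 5u_1 + 8u_2,\ 6v_1 - 7v_2 \rangle = \langle 5u_1, 6v_1 \rangle + \langle 5u_1, -7v_2 \rangle + \langle 8u_2, 6v_1 \rangle + \langle 8u_2, -7v_2 \rangle$
$= 30\langle u_1, v_1 \rangle - 35\langle u_1, v_2 \rangle + 48\langle u_2, v_1 \rangle - 56\langle u_2, v_2 \rangle$

[Remark: Observe the similarity between the above expansion and the expansion $(5a + 8b)(6c - 7d)$ in ordinary algebra.]

(b) $\langle 3u + 5v,\ 4u - 6v \rangle = 12\langle u, u \rangle - 18\langle u, v \rangle + 20\langle v, u \rangle - 30\langle v, v \rangle$
$= 12\langle u, u \rangle + 2\langle u, v \rangle - 30\langle v, v \rangle$

(c) $\|2u - 3v\|^2 = \langle 2u - 3v,\ 2u - 3v \rangle = 4\langle u, u \rangle - 6\langle u, v \rangle - 6\langle v, u \rangle + 9\langle v, v \rangle$
$= 4\|u\|^2 - 12\langle u, v \rangle + 9\|v\|^2$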


7.2. Consider vectors $u = (1, 2, 4)$, $v = (2, -3, 5)$, $w = (4, 2, -3)$ in $\mathbf{R}^3$. Find
(a) $u \cdot v$, (b) $u \cdot w$, (c) $v \cdot w$, (d) $(u + v) \cdot w$, (e) $\|u\|$, (f) $\|v\|$.

(a) Multiply corresponding components and add to get $u \cdot v = 2 - 6 + 20 = 16$.

(b) $u \cdot w = 4 + 4 - 12 = -4$.

(c) $v \cdot w = 8 - 6 - 15 = -13$.

(d) First find $u + v = (3, -1, 9)$. Then $(u + v) \cdot w = 12 - 2 - 27 = -17$. Alternatively, using $[I_1]$, $(u + v) \cdot w = u \cdot w + v \cdot w = -4 - 13 = -17$.

(e) First find $\|u\|^2$ by squaring the components of $u$ and adding:

$\|u\|^2 = 1^2 + 2^2 + 4^2 = 1 + 4 + 16 = 21$, and so $\|u\| = \sqrt{21}$

(f) $\|v\|^2 = 4 + 9 + 25 = 38$, and so $\|v\| = \sqrt{38}$.


7.3. Verify that the following defines an inner product in $\mathbf{R}^2$:

$\langle u, v \rangle = x_1 y_1 - x_1 y_2 - x_2 y_1 + 3 x_2 y_2$, where $u = (x_1, x_2)$, $v = (y_1, y_2)$

We argue via matrices. We can write $\langle u, v \rangle$ in matrix notation as follows:

$\langle u, v \rangle = u^T A v = [x_1, x_2] \begin{bmatrix} 1 & -1 \\ -1 & 3 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$

Because $A$ is real and symmetric, we need only show that $A$ is positive definite. The diagonal elements 1 and 3 are positive, and the determinant $|A| = 3 - 1 = 2$ is positive. Thus, by Theorem 7.14, $A$ is positive definite. Accordingly, by Theorem 7.15, $\langle u, v \rangle$ is an inner product.


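7.4. Consider the vectors $u = (1, 5)$ and $v = (3, 4)$ in $\mathbf{R}^2$. Find

(a) $\langle u, v \rangle$ with respect to the usual inner product in $\mathbf{R}^2$.

(b) $\langle u, v \rangle$ with respect to the inner product in $\mathbf{R}^2$ in Problem 7.3.

(c) $\|v\|$ using the usual inner product in $\mathbf{R}^2$.

(d) $\|v\|$ using the inner product in $\mathbf{R}^2$ in Problem 7.3.

(a) $\langle u, v \rangle = 3 + 20 = 23$.

(b) $\langle u, v \rangle = 1 \cdot 3 - 1 \cdot 4 - 5 \cdot 3 + 3 \cdot 5 \cdot 4 = 3 - 4 - 15 + 60 = 44$.

(c) $\|v\|^2 = \langle v, v \rangle = \langle (3, 4), (3, 4) \rangle = 9 + 16 = 25$; hence, $\|v\| = 5$.

(d) $\|v\|^2 = \langle v, v \rangle = \langle (3, 4), (3, 4) \rangle = 9 - 12 - 12 + 48 = 33$; hence, $\|v\| = \sqrt{33}$.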
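As a quick check of these values, an inner product of the form $\langle u, v \rangle = u^T A v$ can be evaluated directly. This is a sketch only: the function name `inner` is an illustrative choice, and the matrix `A` below is the coefficient matrix of the form $x_1 y_1 - x_1 y_2 - x_2 y_1 + 3 x_2 y_2$ from Problem 7.3.

```python
def inner(u, v, A):
    """Return u^T A v for a square matrix A given as nested lists."""
    n = len(u)
    return sum(u[i] * A[i][j] * v[j] for i in range(n) for j in range(n))

I2 = [[1, 0], [0, 1]]    # usual inner product on R^2
A = [[1, -1], [-1, 3]]   # matrix of the inner product in Problem 7.3

u, v = (1, 5), (3, 4)

print(inner(u, v, I2))   # usual <u, v>
print(inner(u, v, A))    # <u, v> relative to A
print(inner(v, v, I2))   # usual ||v||^2
print(inner(v, v, A))    # ||v||^2 relative to A
```

The same vector $v$ has squared length 25 under one inner product and 33 under the other, which illustrates that a norm depends on the inner product that induces it.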


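7.5. Consider the following polynomials in $P(t)$ with the inner product $\langle f, g \rangle = \int_0^1 f(t)\,g(t)\,dt$:

$f(t) = t + 2$, $g(t) = 3t - 2$, $h(t) = t^2 - 2t - 3$

(a) Find $\langle f, g \rangle$ and $\langle f, h \rangle$.
(b) Find $\|f\|$ and $\|g\|$.
(c) Normalize $f$ and $g$.

(a) Integrate as follows:

$\langle f, g \rangle = \int_0^1 (t + 2)(3t - 2)\,dt = \int_0^1 (3t^2 + 4t - 4)\,dt = \left[ t^3 + 2t^2 - 4t \right]_0^1 = -1$

$\langle f, h \rangle = \int_0^1 (t + 2)(t^2 - 2t - 3)\,dt = \int_0^1 (t^3 - 7t - 6)\,dt = \left[ \frac{t^4}{4} - \frac{7t^2}{2} - 6t \right]_0^1 = -\frac{37}{4}$

(b) $\langle f, f \rangle = \int_0^1 (t + 2)(t + 2)\,dt = \frac{19}{3}$; hence, $\|f\| = \sqrt{\frac{19}{3}} = \frac{1}{3}\sqrt{57}$

$\langle g, g \rangle = \int_0^1 (3t - 2)(3t - 2)\,dt = 1$; hence, $\|g\| = \sqrt{1} = 1$

(c) Because $\|f\| = \frac{1}{3}\sqrt{57}$ and $g$ is already a unit vector, we have

$\hat{f} = \frac{1}{\|f\|} f = \frac{3}{\sqrt{57}}(t + 2)$ and $\hat{g} = g = 3t - 2$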
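7.6. Find $\cos\theta$ where $\theta$ is the angle between:

(a) $u = (1, 3, -5, 4)$ and $v = (2, -3, 4, 1)$ in $\mathbf{R}^4$,

(b) $A = \begin{bmatrix} 9 & 8 & 7 \\ 6 & 5 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$, where $\langle A, B \rangle = \operatorname{tr}(B^T A)$.

Use $\cos\theta = \dfrac{\langle u, v \rangle}{\|u\|\,\|v\|}$.

(a) Compute:

$\langle u, v \rangle = 2 - 9 - 20 + 4 = -23$, $\|u\|^2 = 1 + 9 + 25 + 16 = 51$, $\|v\|^2 = 4 + 9 + 16 + 1 = 30$

Thus, $\cos\theta = \dfrac{-23}{\sqrt{51}\sqrt{30}} = \dfrac{-23}{3\sqrt{170}}$

(b) Use $\langle A, B \rangle = \operatorname{tr}(B^T A) = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} b_{ij}$, the sum of the products of corresponding entries.

$\langle A, B \rangle = 9 + 16 + 21 + 24 + 25 + 24 = 119$

Use $\|A\|^2 = \langle A, A \rangle = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2$, the sum of the squares of all the elements of $A$.

$\|A\|^2 = \langle A, A \rangle = 9^2 + 8^2 + 7^2 + 6^2 + 5^2 + 4^2 = 271$, and so $\|A\| = \sqrt{271}$
$\|B\|^2 = \langle B, B \rangle = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 = 91$, and so $\|B\| = \sqrt{91}$

Thus, $\cos\theta = \dfrac{119}{\sqrt{271}\sqrt{91}}$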


7.7. Verify each of the following:

(a) Parallelogram Law (Fig. 7-7): $\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2$.
(b) Polar form for $\langle u, v \rangle$ (which shows the inner product can be obtained from the norm function):
$\langle u, v \rangle = \frac{1}{4}\left( \|u + v\|^2 - \|u - v\|^2 \right)$.

Expand as follows to obtain

$\|u + v\|^2 = \langle u + v, u + v \rangle = \|u\|^2 + 2\langle u, v \rangle + \|v\|^2$ (1)
$\|u - v\|^2 = \langle u - v, u - v \rangle = \|u\|^2 - 2\langle u, v \rangle + \|v\|^2$ (2)

Add (1) and (2) to get the Parallelogram Law (a). Subtract (2) from (1) to obtain

$\|u + v\|^2 - \|u - v\|^2 = 4\langle u, v \rangle$

Divide by 4 to obtain the (real) polar form (b).

[Figure 7-7]


7.8. Prove Theorem 7.1 (Cauchy–Schwarz): For $u$ and $v$ in a real inner product space $V$,
$\langle u, v \rangle^2 \le \langle u, u \rangle \langle v, v \rangle$ or $|\langle u, v \rangle| \le \|u\|\,\|v\|$.

For any real number $t$,

$\langle tu + v, tu + v \rangle = t^2 \langle u, u \rangle + 2t\langle u, v \rangle + \langle v, v \rangle = t^2 \|u\|^2 + 2t\langle u, v \rangle + \|v\|^2$

Let $a = \|u\|^2$, $b = 2\langle u, v \rangle$, $c = \|v\|^2$. Because $\|tu + v\|^2 \ge 0$, we have

$at^2 + bt + c \ge 0$

for every value of $t$. This means that the quadratic polynomial cannot have two distinct real roots, which implies that $b^2 - 4ac \le 0$ or $b^2 \le 4ac$. Thus,

$4\langle u, v \rangle^2 \le 4\|u\|^2 \|v\|^2$

Dividing by 4 gives our result.


7.9. Prove Theorem 7.2: The norm in an inner product space $V$ satisfies

(a) $[N_1]$ $\|v\| \ge 0$; and $\|v\| = 0$ if and only if $v = 0$.
(b) $[N_2]$ $\|kv\| = |k|\,\|v\|$.
(c) $[N_3]$ $\|u + v\| \le \|u\| + \|v\|$.

(a) If $v \ne 0$, then $\langle v, v \rangle > 0$, and hence, $\|v\| = \sqrt{\langle v, v \rangle} > 0$. If $v = 0$, then $\langle 0, 0 \rangle = 0$. Consequently, $\|0\| = \sqrt{0} = 0$. Thus, $[N_1]$ is true.

(b) We have $\|kv\|^2 = \langle kv, kv \rangle = k^2 \langle v, v \rangle = k^2 \|v\|^2$. Taking the square root of both sides gives $[N_2]$.

(c) Using the Cauchy–Schwarz inequality, we obtain

$\|u + v\|^2 = \langle u + v, u + v \rangle = \langle u, u \rangle + \langle u, v \rangle + \langle u, v \rangle + \langle v, v \rangle$
$\le \|u\|^2 + 2\|u\|\,\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2$

Taking the square root of both sides yields $[N_3]$.


Orthogonality, Orthonormal Complements, Orthogonal Sets 


7.10. Find $k$ so that $u = (1, 2, k, 3)$ and $v = (3, k, 7, -5)$ in $\mathbf{R}^4$ are orthogonal.
First find

$\langle u, v \rangle = (1, 2, k, 3) \cdot (3, k, 7, -5) = 3 + 2k + 7k - 15 = 9k - 12$

Then set $\langle u, v \rangle = 9k - 12 = 0$ to obtain $k = \frac{4}{3}$.


7.11. Let $W$ be the subspace of $\mathbf{R}^5$ spanned by $u = (1, 2, 3, -1, 2)$ and $v = (2, 4, 7, 2, -1)$. Find a basis of the orthogonal complement $W^\perp$ of $W$.

We seek all vectors $w = (x, y, z, s, t)$ such that

$\langle w, u \rangle = x + 2y + 3z - s + 2t = 0$
$\langle w, v \rangle = 2x + 4y + 7z + 2s - t = 0$

Eliminating $x$ from the second equation, we find the equivalent system

$x + 2y + 3z - s + 2t = 0$
$z + 4s - 5t = 0$

The free variables are $y$, $s$, and $t$. Therefore,

(1) Set $y = -1$, $s = 0$, $t = 0$ to obtain the solution $w_1 = (2, -1, 0, 0, 0)$.

(2) Set $y = 0$, $s = 1$, $t = 0$ to find the solution $w_2 = (13, 0, -4, 1, 0)$.

(3) Set $y = 0$, $s = 0$, $t = 1$ to obtain the solution $w_3 = (-17, 0, 5, 0, 1)$.

The set $\{w_1, w_2, w_3\}$ is a basis of $W^\perp$.
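A candidate basis of an orthogonal complement is easy to sanity-check: every basis vector must have zero dot product with every spanning vector of $W$. The sketch below does this for the vectors of Problem 7.11; the helper name `dot` is an illustrative choice.

```python
def dot(x, y):
    # Usual inner product (dot product) on R^n.
    return sum(a * b for a, b in zip(x, y))

u = (1, 2, 3, -1, 2)
v = (2, 4, 7, 2, -1)

basis = [
    (2, -1, 0, 0, 0),
    (13, 0, -4, 1, 0),
    (-17, 0, 5, 0, 1),
]

# Each pair (w . u, w . v) should be (0, 0).
checks = [(dot(w, u), dot(w, v)) for w in basis]
print(checks)
```

This only confirms that the three vectors lie in $W^\perp$. Their linear independence follows from the pattern of the free-variable entries in positions 2, 4, and 5.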
7.12. Let $w = (1, 2, 3, 1)$ be a vector in $\mathbf{R}^4$. Find an orthogonal basis for $w^\perp$.

Find a nonzero solution of $x + 2y + 3z + t = 0$, say $v_1 = (0, 0, 1, -3)$. Now find a nonzero solution of the system

$x + 2y + 3z + t = 0$, $z - 3t = 0$

say $v_2 = (0, -5, 3, 1)$. Last, find a nonzero solution of the system

$x + 2y + 3z + t = 0$, $-5y + 3z + t = 0$, $z - 3t = 0$

say $v_3 = (-14, 2, 3, 1)$. Thus, $v_1$, $v_2$, $v_3$ form an orthogonal basis for $w^\perp$.


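7.13. Let $S$ consist of the following vectors in $\mathbf{R}^4$:

$u_1 = (1, 1, 0, -1)$, $u_2 = (1, 2, 1, 3)$, $u_3 = (1, 1, -9, 2)$, $u_4 = (16, -13, 1, 3)$

(a) Show that $S$ is orthogonal and a basis of $\mathbf{R}^4$.

(b) Find the coordinates of an arbitrary vector $v = (a, b, c, d)$ in $\mathbf{R}^4$ relative to the basis $S$.

(a) Compute

$u_1 \cdot u_2 = 1 + 2 + 0 - 3 = 0$, $u_1 \cdot u_3 = 1 + 1 + 0 - 2 = 0$, $u_1 \cdot u_4 = 16 - 13 + 0 - 3 = 0$
$u_2 \cdot u_3 = 1 + 2 - 9 + 6 = 0$, $u_2 \cdot u_4 = 16 - 26 + 1 + 9 = 0$, $u_3 \cdot u_4 = 16 - 13 - 9 + 6 = 0$

Thus, $S$ is orthogonal, and $S$ is linearly independent. Accordingly, $S$ is a basis for $\mathbf{R}^4$ because any four linearly independent vectors form a basis of $\mathbf{R}^4$.

(b) Because $S$ is orthogonal, we need only find the Fourier coefficients of $v$ with respect to the basis vectors, as in Theorem 7.7. Thus,

$k_1 = \dfrac{\langle v, u_1 \rangle}{\langle u_1, u_1 \rangle} = \dfrac{a + b - d}{3}$, $k_3 = \dfrac{\langle v, u_3 \rangle}{\langle u_3, u_3 \rangle} = \dfrac{a + b - 9c + 2d}{87}$

$k_2 = \dfrac{\langle v, u_2 \rangle}{\langle u_2, u_2 \rangle} = \dfrac{a + 2b + c + 3d}{15}$, $k_4 = \dfrac{\langle v, u_4 \rangle}{\langle u_4, u_4 \rangle} = \dfrac{16a - 13b + c + 3d}{435}$

are the coordinates of $v$ with respect to the basis $S$.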
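The Fourier-coefficient formula $k_i = \langle v, u_i \rangle / \langle u_i, u_i \rangle$ can be tried on a concrete vector. The sketch below uses exact rational arithmetic; the sample vector $v = (1, 2, 3, 4)$ and the helper names `dot` and `coordinates` are assumptions made for illustration and do not come from the text.

```python
from fractions import Fraction

def dot(x, y):
    # Usual inner product on R^n.
    return sum(a * b for a, b in zip(x, y))

def coordinates(v, basis):
    # Fourier coefficients of v relative to an orthogonal basis.
    return [Fraction(dot(v, u), dot(u, u)) for u in basis]

S = [(1, 1, 0, -1), (1, 2, 1, 3), (1, 1, -9, 2), (16, -13, 1, 3)]
v = (1, 2, 3, 4)  # hypothetical sample vector

coords = coordinates(v, S)

# Rebuild v from its coordinates: v = k1*u1 + k2*u2 + k3*u3 + k4*u4.
recon = tuple(sum(k * u[i] for k, u in zip(coords, S)) for i in range(4))

print(coords)
print(recon)
```

No linear system has to be solved here. That shortcut is available only because the basis is orthogonal.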


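7.14. Suppose $S$, $S_1$, $S_2$ are subsets of $V$. Prove the following:

(a) $S \subseteq S^{\perp\perp}$.
(b) If $S_1 \subseteq S_2$, then $S_2^\perp \subseteq S_1^\perp$.
(c) $S^\perp = \operatorname{span}(S)^\perp$.

(a) Let $w \in S$. Then $\langle w, v \rangle = 0$ for every $v \in S^\perp$; hence, $w \in S^{\perp\perp}$. Accordingly, $S \subseteq S^{\perp\perp}$.

(b) Let $w \in S_2^\perp$. Then $\langle w, v \rangle = 0$ for every $v \in S_2$. Because $S_1 \subseteq S_2$, $\langle w, v \rangle = 0$ for every $v \in S_1$. Thus, $w \in S_1^\perp$, and hence, $S_2^\perp \subseteq S_1^\perp$.

(c) Because $S \subseteq \operatorname{span}(S)$, part (b) gives us $\operatorname{span}(S)^\perp \subseteq S^\perp$. Suppose $u \in S^\perp$ and $v \in \operatorname{span}(S)$. Then there exist $w_1, w_2, \ldots, w_k$ in $S$ such that $v = a_1 w_1 + a_2 w_2 + \cdots + a_k w_k$. Then, using $u \in S^\perp$, we have

$\langle u, v \rangle = \langle u, a_1 w_1 + a_2 w_2 + \cdots + a_k w_k \rangle = a_1 \langle u, w_1 \rangle + a_2 \langle u, w_2 \rangle + \cdots + a_k \langle u, w_k \rangle$
$= a_1(0) + a_2(0) + \cdots + a_k(0) = 0$

Thus, $u \in \operatorname{span}(S)^\perp$. Accordingly, $S^\perp \subseteq \operatorname{span}(S)^\perp$. Both inclusions give $S^\perp = \operatorname{span}(S)^\perp$.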


7.15. Prove Theorem 7.5: Suppose $S$ is an orthogonal set of nonzero vectors. Then $S$ is linearly independent.

Suppose $S = \{u_1, u_2, \ldots, u_r\}$ and suppose

$a_1 u_1 + a_2 u_2 + \cdots + a_r u_r = 0$ (1)

Taking the inner product of (1) with $u_1$, we get

$0 = \langle 0, u_1 \rangle = \langle a_1 u_1 + a_2 u_2 + \cdots + a_r u_r, u_1 \rangle$
$= a_1 \langle u_1, u_1 \rangle + a_2 \langle u_2, u_1 \rangle + \cdots + a_r \langle u_r, u_1 \rangle$
$= a_1 \langle u_1, u_1 \rangle + a_2 \cdot 0 + \cdots + a_r \cdot 0 = a_1 \langle u_1, u_1 \rangle$

Because $u_1 \ne 0$, we have $\langle u_1, u_1 \rangle \ne 0$. Thus, $a_1 = 0$. Similarly, for $i = 2, \ldots, r$, taking the inner product of (1) with $u_i$,

$0 = \langle 0, u_i \rangle = \langle a_1 u_1 + \cdots + a_r u_r, u_i \rangle$
$= a_1 \langle u_1, u_i \rangle + \cdots + a_i \langle u_i, u_i \rangle + \cdots + a_r \langle u_r, u_i \rangle = a_i \langle u_i, u_i \rangle$

But $\langle u_i, u_i \rangle \ne 0$, and hence, every $a_i = 0$. Thus, $S$ is linearly independent.


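7.16. Prove Theorem 7.6 (Pythagoras): Suppose $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of vectors. Then

$\|u_1 + u_2 + \cdots + u_r\|^2 = \|u_1\|^2 + \|u_2\|^2 + \cdots + \|u_r\|^2$

Expanding the inner product, we have

$\|u_1 + u_2 + \cdots + u_r\|^2 = \langle u_1 + u_2 + \cdots + u_r,\ u_1 + u_2 + \cdots + u_r \rangle$
$= \langle u_1, u_1 \rangle + \langle u_2, u_2 \rangle + \cdots + \langle u_r, u_r \rangle + \sum_{i \ne j} \langle u_i, u_j \rangle$

The theorem follows from the fact that $\langle u_i, u_i \rangle = \|u_i\|^2$ and $\langle u_i, u_j \rangle = 0$ for $i \ne j$.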


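7.17. Prove Theorem 7.7: Let $\{u_1, u_2, \ldots, u_n\}$ be an orthogonal basis of $V$. Then for any $v \in V$,

$v = \dfrac{\langle v, u_1 \rangle}{\langle u_1, u_1 \rangle} u_1 + \dfrac{\langle v, u_2 \rangle}{\langle u_2, u_2 \rangle} u_2 + \cdots + \dfrac{\langle v, u_n \rangle}{\langle u_n, u_n \rangle} u_n$

Suppose $v = k_1 u_1 + k_2 u_2 + \cdots + k_n u_n$. Taking the inner product of both sides with $u_1$ yields

$\langle v, u_1 \rangle = \langle k_1 u_1 + k_2 u_2 + \cdots + k_n u_n, u_1 \rangle$
$= k_1 \langle u_1, u_1 \rangle + k_2 \langle u_2, u_1 \rangle + \cdots + k_n \langle u_n, u_1 \rangle$
$= k_1 \langle u_1, u_1 \rangle + k_2 \cdot 0 + \cdots + k_n \cdot 0 = k_1 \langle u_1, u_1 \rangle$

Thus, $k_1 = \dfrac{\langle v, u_1 \rangle}{\langle u_1, u_1 \rangle}$. Similarly, for $i = 2, \ldots, n$,

$\langle v, u_i \rangle = \langle k_1 u_1 + k_2 u_2 + \cdots + k_n u_n, u_i \rangle$
$= k_1 \langle u_1, u_i \rangle + k_2 \langle u_2, u_i \rangle + \cdots + k_n \langle u_n, u_i \rangle$
$= k_1 \cdot 0 + \cdots + k_i \langle u_i, u_i \rangle + \cdots + k_n \cdot 0 = k_i \langle u_i, u_i \rangle$

Thus, $k_i = \dfrac{\langle v, u_i \rangle}{\langle u_i, u_i \rangle}$. Substituting for $k_i$ in the equation $v = k_1 u_1 + \cdots + k_n u_n$, we obtain the desired result.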
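7.18. Suppose $E = \{e_1, e_2, \ldots, e_n\}$ is an orthonormal basis of $V$. Prove

(a) For any $u \in V$, we have $u = \langle u, e_1 \rangle e_1 + \langle u, e_2 \rangle e_2 + \cdots + \langle u, e_n \rangle e_n$.
(b) $\langle a_1 e_1 + \cdots + a_n e_n,\ b_1 e_1 + \cdots + b_n e_n \rangle = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$.
(c) For any $u, v \in V$, we have $\langle u, v \rangle = \langle u, e_1 \rangle \langle v, e_1 \rangle + \cdots + \langle u, e_n \rangle \langle v, e_n \rangle$.

(a) Suppose $u = k_1 e_1 + k_2 e_2 + \cdots + k_n e_n$. Taking the inner product of $u$ with $e_1$,

$\langle u, e_1 \rangle = \langle k_1 e_1 + k_2 e_2 + \cdots + k_n e_n, e_1 \rangle$
$= k_1 \langle e_1, e_1 \rangle + k_2 \langle e_2, e_1 \rangle + \cdots + k_n \langle e_n, e_1 \rangle$
$= k_1(1) + k_2(0) + \cdots + k_n(0) = k_1$

Similarly, for $i = 2, \ldots, n$,

$\langle u, e_i \rangle = \langle k_1 e_1 + \cdots + k_i e_i + \cdots + k_n e_n, e_i \rangle$
$= k_1 \langle e_1, e_i \rangle + \cdots + k_i \langle e_i, e_i \rangle + \cdots + k_n \langle e_n, e_i \rangle$
$= k_1(0) + \cdots + k_i(1) + \cdots + k_n(0) = k_i$

Substituting $\langle u, e_i \rangle$ for $k_i$ in the equation $u = k_1 e_1 + \cdots + k_n e_n$, we obtain the desired result.

(b) We have

$\left\langle \sum_{i=1}^n a_i e_i,\ \sum_{j=1}^n b_j e_j \right\rangle = \sum_{i,j=1}^n a_i b_j \langle e_i, e_j \rangle = \sum_{i=1}^n a_i b_i \langle e_i, e_i \rangle + \sum_{i \ne j} a_i b_j \langle e_i, e_j \rangle$

But $\langle e_i, e_j \rangle = 0$ for $i \ne j$, and $\langle e_i, e_j \rangle = 1$ for $i = j$. Hence, as required,

$\left\langle \sum_{i=1}^n a_i e_i,\ \sum_{j=1}^n b_j e_j \right\rangle = \sum_{i=1}^n a_i b_i = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$

(c) By part (a), we have

$u = \langle u, e_1 \rangle e_1 + \cdots + \langle u, e_n \rangle e_n$ and $v = \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_n \rangle e_n$

Thus, by part (b),

$\langle u, v \rangle = \langle u, e_1 \rangle \langle v, e_1 \rangle + \langle u, e_2 \rangle \langle v, e_2 \rangle + \cdots + \langle u, e_n \rangle \langle v, e_n \rangle$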


Projections, Gram-Schmidt Algorithm, Applications 


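7.19. Suppose $w \ne 0$. Let $v$ be any vector in $V$. Show that

$c = \dfrac{\langle v, w \rangle}{\langle w, w \rangle} = \dfrac{\langle v, w \rangle}{\|w\|^2}$

is the unique scalar such that $v' = v - cw$ is orthogonal to $w$.

In order for $v'$ to be orthogonal to $w$ we must have

$\langle v - cw, w \rangle = 0$ or $\langle v, w \rangle - c\langle w, w \rangle = 0$ or $\langle v, w \rangle = c\langle w, w \rangle$

Thus, $c = \dfrac{\langle v, w \rangle}{\langle w, w \rangle}$. Conversely, suppose $c = \dfrac{\langle v, w \rangle}{\langle w, w \rangle}$. Then

$\langle v - cw, w \rangle = \langle v, w \rangle - c\langle w, w \rangle = \langle v, w \rangle - \dfrac{\langle v, w \rangle}{\langle w, w \rangle} \langle w, w \rangle = 0$

7.20. Find the Fourier coefficient $c$ and the projection of $v = (1, -2, 3, -4)$ along $w = (1, 2, 1, 2)$ in $\mathbf{R}^4$.

Compute $\langle v, w \rangle = 1 - 4 + 3 - 8 = -8$ and $\|w\|^2 = 1 + 4 + 1 + 4 = 10$. Then

$c = -\frac{8}{10} = -\frac{4}{5}$ and $\operatorname{proj}(v, w) = cw = \left( -\frac{4}{5}, -\frac{8}{5}, -\frac{4}{5}, -\frac{8}{5} \right)$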
7.21. Consider the subspace $U$ of $\mathbf{R}^4$ spanned by the vectors:

$v_1 = (1, 1, 1, 1)$, $v_2 = (1, 1, 2, 4)$, $v_3 = (1, 2, -4, -3)$

Find (a) an orthogonal basis of $U$; (b) an orthonormal basis of $U$.

(a) Use the Gram–Schmidt algorithm. Begin by setting $w_1 = v_1 = (1, 1, 1, 1)$. Next find

$v_2 - \dfrac{\langle v_2, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 = (1, 1, 2, 4) - \frac{8}{4}(1, 1, 1, 1) = (-1, -1, 0, 2)$

Set $w_2 = (-1, -1, 0, 2)$. Then find

$v_3 - \dfrac{\langle v_3, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 - \dfrac{\langle v_3, w_2 \rangle}{\langle w_2, w_2 \rangle} w_2 = (1, 2, -4, -3) - \frac{(-4)}{4}(1, 1, 1, 1) - \frac{(-9)}{6}(-1, -1, 0, 2)$
$= \left( \frac{1}{2}, \frac{3}{2}, -3, 1 \right)$

Clear fractions to obtain $w_3 = (1, 3, -6, 2)$. Then $w_1$, $w_2$, $w_3$ form an orthogonal basis of $U$.

(b) Normalize the orthogonal basis consisting of $w_1$, $w_2$, $w_3$. Because $\|w_1\|^2 = 4$, $\|w_2\|^2 = 6$, and $\|w_3\|^2 = 50$, the following vectors form an orthonormal basis of $U$:

$u_1 = \frac{1}{2}(1, 1, 1, 1)$, $u_2 = \frac{1}{\sqrt{6}}(-1, -1, 0, 2)$, $u_3 = \frac{1}{5\sqrt{2}}(1, 3, -6, 2)$
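The Gram–Schmidt steps used in Problem 7.21 can be sketched in a few lines of Python with exact rational arithmetic. The function names `dot` and `gram_schmidt` are illustrative, the input vectors are assumed to be linearly independent (otherwise a division by zero occurs), and the sketch does not clear fractions, so the third vector comes out as $(\frac{1}{2}, \frac{3}{2}, -3, 1)$ rather than $(1, 3, -6, 2)$.

```python
from fractions import Fraction

def dot(x, y):
    # Usual inner product on R^n.
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Return an orthogonal basis for the span of linearly independent vectors."""
    basis = []
    for v in vectors:
        w = [Fraction(c) for c in v]
        for b in basis:
            # Subtract the projection of v along each earlier basis vector.
            c = Fraction(dot(v, b), dot(b, b))
            w = [wi - c * bi for wi, bi in zip(w, b)]
        basis.append(w)
    return basis

ws = gram_schmidt([(1, 1, 1, 1), (1, 1, 2, 4), (1, 2, -4, -3)])
for w in ws:
    print([str(c) for c in w])
```

Scaling any output vector by a nonzero constant preserves orthogonality, which is why clearing fractions by hand is harmless.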
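7.22. Consider the vector space $P(t)$ with inner product $\langle f, g \rangle = \int_0^1 f(t)\,g(t)\,dt$. Apply the Gram–Schmidt algorithm to the set $\{1, t, t^2\}$ to obtain an orthogonal set $\{f_0, f_1, f_2\}$ with integer coefficients.

First set $f_0 = 1$. Then find

$t - \dfrac{\langle t, 1 \rangle}{\langle 1, 1 \rangle} \cdot 1 = t - \dfrac{1/2}{1} \cdot 1 = t - \frac{1}{2}$

Clear fractions to obtain $f_1 = 2t - 1$. Then find

$t^2 - \dfrac{\langle t^2, 1 \rangle}{\langle 1, 1 \rangle}(1) - \dfrac{\langle t^2, 2t - 1 \rangle}{\langle 2t - 1, 2t - 1 \rangle}(2t - 1) = t^2 - \dfrac{1/3}{1}(1) - \dfrac{1/6}{1/3}(2t - 1) = t^2 - t + \frac{1}{6}$

Clear fractions to obtain $f_2 = 6t^2 - 6t + 1$. Thus, $\{1,\ 2t - 1,\ 6t^2 - 6t + 1\}$ is the required orthogonal set.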


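7.23. Suppose $v = (1, 3, 5, 7)$. Find the projection of $v$ onto $W$ or, in other words, find $w \in W$ that minimizes $\|v - w\|$, where $W$ is the subspace of $\mathbf{R}^4$ spanned by

(a) $u_1 = (1, 1, 1, 1)$ and $u_2 = (1, -3, 4, -2)$,

(b) $v_1 = (1, 1, 1, 1)$ and $v_2 = (1, 2, 3, 2)$.

(a) Because $u_1$ and $u_2$ are orthogonal, we need only compute the Fourier coefficients:

$c_1 = \dfrac{\langle v, u_1 \rangle}{\langle u_1, u_1 \rangle} = \dfrac{1 + 3 + 5 + 7}{1 + 1 + 1 + 1} = \dfrac{16}{4} = 4$

$c_2 = \dfrac{\langle v, u_2 \rangle}{\langle u_2, u_2 \rangle} = \dfrac{1 - 9 + 20 - 14}{1 + 9 + 16 + 4} = \dfrac{-2}{30} = -\dfrac{1}{15}$

Then $w = \operatorname{proj}(v, W) = c_1 u_1 + c_2 u_2 = 4(1, 1, 1, 1) - \frac{1}{15}(1, -3, 4, -2) = \left( \frac{59}{15}, \frac{63}{15}, \frac{56}{15}, \frac{62}{15} \right)$.

(b) Because $v_1$ and $v_2$ are not orthogonal, first apply the Gram–Schmidt algorithm to find an orthogonal basis for $W$. Set $w_1 = v_1 = (1, 1, 1, 1)$. Then find

$v_2 - \dfrac{\langle v_2, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 = (1, 2, 3, 2) - \frac{8}{4}(1, 1, 1, 1) = (-1, 0, 1, 0)$

Set $w_2 = (-1, 0, 1, 0)$. Now compute

$c_1 = \dfrac{\langle v, w_1 \rangle}{\langle w_1, w_1 \rangle} = \dfrac{1 + 3 + 5 + 7}{1 + 1 + 1 + 1} = \dfrac{16}{4} = 4$

$c_2 = \dfrac{\langle v, w_2 \rangle}{\langle w_2, w_2 \rangle} = \dfrac{-1 + 0 + 5 + 0}{1 + 0 + 1 + 0} = \dfrac{4}{2} = 2$

Then $w = \operatorname{proj}(v, W) = c_1 w_1 + c_2 w_2 = 4(1, 1, 1, 1) + 2(-1, 0, 1, 0) = (2, 4, 6, 4)$. As a check, $v - w = (-1, -1, -1, 3)$ is orthogonal to both $v_1$ and $v_2$.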
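A projection onto a subspace can be verified by checking that the residual $v - w$ is orthogonal to every spanning vector. The sketch below does this for $v = (1, 3, 5, 7)$ and the orthogonal basis $(1, 1, 1, 1)$, $(-1, 0, 1, 0)$ of the subspace in part (b); the names `dot` and `project` are illustrative, and `project` assumes its second argument is already an orthogonal set of nonzero vectors.

```python
from fractions import Fraction

def dot(x, y):
    # Usual inner product on R^n.
    return sum(a * b for a, b in zip(x, y))

def project(v, orthogonal_basis):
    """Sum of the projections of v along each vector of an orthogonal basis."""
    w = [Fraction(0)] * len(v)
    for b in orthogonal_basis:
        c = Fraction(dot(v, b), dot(b, b))  # Fourier coefficient of v along b
        w = [wi + c * bi for wi, bi in zip(w, b)]
    return w

v = (1, 3, 5, 7)
w = project(v, [(1, 1, 1, 1), (-1, 0, 1, 0)])
residual = [a - b for a, b in zip(v, w)]

print([str(c) for c in w])
print(dot(residual, (1, 1, 1, 1)), dot(residual, (1, 2, 3, 2)))
```

Both dot products of the residual vanish, which is exactly the condition characterizing the closest point of $W$ to $v$.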


7.24. Suppose $w_1$ and $w_2$ are nonzero orthogonal vectors. Let $v$ be any vector in $V$. Find $c_1$ and $c_2$ so that $v'$ is orthogonal to $w_1$ and $w_2$, where $v' = v - c_1 w_1 - c_2 w_2$.

If $v'$ is orthogonal to $w_1$, then

$0 = \langle v - c_1 w_1 - c_2 w_2, w_1 \rangle = \langle v, w_1 \rangle - c_1 \langle w_1, w_1 \rangle - c_2 \langle w_2, w_1 \rangle$
$= \langle v, w_1 \rangle - c_1 \langle w_1, w_1 \rangle - c_2 \cdot 0 = \langle v, w_1 \rangle - c_1 \langle w_1, w_1 \rangle$

Thus, $c_1 = \langle v, w_1 \rangle / \langle w_1, w_1 \rangle$. (That is, $c_1$ is the component of $v$ along $w_1$.) Similarly, if $v'$ is orthogonal to $w_2$, then

$0 = \langle v - c_1 w_1 - c_2 w_2, w_2 \rangle = \langle v, w_2 \rangle - c_2 \langle w_2, w_2 \rangle$

Thus, $c_2 = \langle v, w_2 \rangle / \langle w_2, w_2 \rangle$. (That is, $c_2$ is the component of $v$ along $w_2$.)


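7.25. Prove Theorem 7.8: Suppose $w_1, w_2, \ldots, w_r$ form an orthogonal set of nonzero vectors in $V$. Let $v \in V$. Define

$v' = v - (c_1 w_1 + c_2 w_2 + \cdots + c_r w_r)$, where $c_i = \dfrac{\langle v, w_i \rangle}{\langle w_i, w_i \rangle}$

Then $v'$ is orthogonal to $w_1, w_2, \ldots, w_r$.

For $i = 1, 2, \ldots, r$ and using $\langle w_i, w_j \rangle = 0$ for $i \ne j$, we have

$\langle v - c_1 w_1 - c_2 w_2 - \cdots - c_r w_r, w_i \rangle = \langle v, w_i \rangle - c_1 \langle w_1, w_i \rangle - \cdots - c_i \langle w_i, w_i \rangle - \cdots - c_r \langle w_r, w_i \rangle$
$= \langle v, w_i \rangle - c_1 \cdot 0 - \cdots - c_i \langle w_i, w_i \rangle - \cdots - c_r \cdot 0$
$= \langle v, w_i \rangle - c_i \langle w_i, w_i \rangle = \langle v, w_i \rangle - \dfrac{\langle v, w_i \rangle}{\langle w_i, w_i \rangle} \langle w_i, w_i \rangle = 0$

The theorem is proved.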
7.26. Prove Theorem 7.9: Let $\{v_1, v_2, \ldots, v_n\}$ be any basis of an inner product space $V$. Then there exists an orthonormal basis $\{u_1, u_2, \ldots, u_n\}$ of $V$ such that the change-of-basis matrix from $\{v_i\}$ to $\{u_i\}$ is triangular; that is, for $k = 1, 2, \ldots, n$,

$u_k = a_{k1} v_1 + a_{k2} v_2 + \cdots + a_{kk} v_k$

The proof uses the Gram–Schmidt algorithm and Remarks 1 and 3 of Section 7.7. That is, apply the algorithm to $\{v_i\}$ to obtain an orthogonal basis $\{w_1, \ldots, w_n\}$, and then normalize $\{w_i\}$ to obtain an orthonormal basis $\{u_i\}$ of $V$. The specific algorithm guarantees that each $w_k$ is a linear combination of $v_1, \ldots, v_k$, and hence, each $u_k$ is a linear combination of $v_1, \ldots, v_k$.


7.27. Prove Theorem 7.10: Suppose $S = \{w_1, w_2, \ldots, w_r\}$ is an orthogonal basis for a subspace $W$ of $V$. Then one may extend $S$ to an orthogonal basis for $V$; that is, one may find vectors $w_{r+1}, \ldots, w_n$ such that $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for $V$.

Extend $S$ to a basis $S' = \{w_1, \ldots, w_r, v_{r+1}, \ldots, v_n\}$ for $V$. Applying the Gram–Schmidt algorithm to $S'$, we first obtain $w_1, w_2, \ldots, w_r$ because $S$ is orthogonal, and then we obtain vectors $w_{r+1}, \ldots, w_n$, where $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for $V$. Thus, the theorem is proved.


7.28. Prove Theorem 7.4: Let $W$ be a subspace of $V$. Then $V = W \oplus W^\perp$.

By Theorem 7.9, there exists an orthogonal basis $\{u_1, \ldots, u_r\}$ of $W$, and by Theorem 7.10 we can extend it to an orthogonal basis $\{u_1, u_2, \ldots, u_n\}$ of $V$. Hence, $u_{r+1}, \ldots, u_n \in W^\perp$. If $v \in V$, then

$v = a_1 u_1 + \cdots + a_n u_n$, where $a_1 u_1 + \cdots + a_r u_r \in W$ and $a_{r+1} u_{r+1} + \cdots + a_n u_n \in W^\perp$

Accordingly, $V = W + W^\perp$.
On the other hand, if $w \in W \cap W^\perp$, then $\langle w, w \rangle = 0$. This yields $w = 0$. Hence, $W \cap W^\perp = \{0\}$.
The two conditions $V = W + W^\perp$ and $W \cap W^\perp = \{0\}$ give the desired result $V = W \oplus W^\perp$.

Remark: Note that we have proved the theorem for the case that $V$ has finite dimension. We remark that the theorem also holds for spaces of arbitrary dimension.


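7.29. Suppose $W$ is a subspace of a finite-dimensional space $V$. Prove that $W = W^{\perp\perp}$.

By Theorem 7.4, $V = W \oplus W^\perp$, and also $V = W^\perp \oplus W^{\perp\perp}$. Hence,

$\dim W = \dim V - \dim W^\perp$ and $\dim W^{\perp\perp} = \dim V - \dim W^\perp$

This yields $\dim W = \dim W^{\perp\perp}$. But $W \subseteq W^{\perp\perp}$ (see Problem 7.14). Hence, $W = W^{\perp\perp}$, as required.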


7.30. Prove the following: Suppose $w_1, w_2, \ldots, w_r$ form an orthogonal set of nonzero vectors in $V$. Let $v$ be any vector in $V$ and let $c_i$ be the component of $v$ along $w_i$. Then, for any scalars $a_1, \ldots, a_r$, we have

$\left\| v - \sum_{k=1}^r c_k w_k \right\| \le \left\| v - \sum_{k=1}^r a_k w_k \right\|$

That is, $\sum c_i w_i$ is the closest approximation to $v$ as a linear combination of $w_1, \ldots, w_r$.

By Theorem 7.8, $v - \sum c_k w_k$ is orthogonal to every $w_i$ and hence orthogonal to any linear combination of $w_1, w_2, \ldots, w_r$. Therefore, using the Pythagorean theorem and summing from $k = 1$ to $r$,

$\left\| v - \sum a_k w_k \right\|^2 = \left\| v - \sum c_k w_k + \sum (c_k - a_k) w_k \right\|^2 = \left\| v - \sum c_k w_k \right\|^2 + \left\| \sum (c_k - a_k) w_k \right\|^2$
$\ge \left\| v - \sum c_k w_k \right\|^2$

The square root of both sides gives our theorem.


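7.31. Suppose $\{e_1, e_2, \ldots, e_r\}$ is an orthonormal set of vectors in $V$. Let $v$ be any vector in $V$ and let $c_i$ be the Fourier coefficient of $v$ with respect to $e_i$. Prove Bessel's inequality:

$\sum_{k=1}^r c_k^2 \le \|v\|^2$

Note that $c_i = \langle v, e_i \rangle$, because $\|e_i\| = 1$. Then, using $\langle e_i, e_j \rangle = 0$ for $i \ne j$ and summing from $k = 1$ to $r$, we get

$0 \le \left\langle v - \sum c_k e_k,\ v - \sum c_k e_k \right\rangle = \langle v, v \rangle - 2\left\langle v, \sum c_k e_k \right\rangle + \sum c_k^2 = \langle v, v \rangle - \sum 2 c_k \langle v, e_k \rangle + \sum c_k^2$
$= \langle v, v \rangle - \sum 2 c_k^2 + \sum c_k^2 = \langle v, v \rangle - \sum c_k^2$

This gives us our inequality.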


Orthogonal Matrices 


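7.32. Find an orthogonal matrix $P$ whose first row is $u_1 = \left( \frac{1}{3}, \frac{2}{3}, \frac{2}{3} \right)$.

First find a nonzero vector $w_2 = (x, y, z)$ that is orthogonal to $u_1$; that is, for which

$0 = \langle u_1, w_2 \rangle = \dfrac{x}{3} + \dfrac{2y}{3} + \dfrac{2z}{3}$ or $x + 2y + 2z = 0$

One such solution is $w_2 = (0, 1, -1)$. Normalize $w_2$ to obtain the second row of $P$:

$u_2 = (0,\ 1/\sqrt{2},\ -1/\sqrt{2})$

Next find a nonzero vector $w_3 = (x, y, z)$ that is orthogonal to both $u_1$ and $u_2$; that is, for which

$0 = \langle u_1, w_3 \rangle = \dfrac{x}{3} + \dfrac{2y}{3} + \dfrac{2z}{3}$ or $x + 2y + 2z = 0$
$0 = \langle u_2, w_3 \rangle = \dfrac{y}{\sqrt{2}} - \dfrac{z}{\sqrt{2}}$ or $y - z = 0$

Set $z = -1$ and find the solution $w_3 = (4, -1, -1)$. Normalize $w_3$ and obtain the third row of $P$; that is,

$u_3 = (4/\sqrt{18},\ -1/\sqrt{18},\ -1/\sqrt{18})$.

Thus, $P = \begin{bmatrix} \frac{1}{3} & \frac{2}{3} & \frac{2}{3} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ 4/(3\sqrt{2}) & -1/(3\sqrt{2}) & -1/(3\sqrt{2}) \end{bmatrix}$

We emphasize that the above matrix $P$ is not unique.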


7.33. Let $A = \begin{bmatrix} 1 & 1 & -1 \\ 1 & 3 & 4 \\ 7 & -5 & 2 \end{bmatrix}$. Determine whether or not: (a) the rows of $A$ are orthogonal; (b) $A$ is an orthogonal matrix; (c) the columns of $A$ are orthogonal.

(a) Yes, because $(1, 1, -1) \cdot (1, 3, 4) = 1 + 3 - 4 = 0$, $(1, 1, -1) \cdot (7, -5, 2) = 7 - 5 - 2 = 0$, and $(1, 3, 4) \cdot (7, -5, 2) = 7 - 15 + 8 = 0$.

(b) No, because the rows of $A$ are not unit vectors; for example, $\|(1, 1, -1)\|^2 = 1 + 1 + 1 = 3$.

(c) No; for example, $(1, 1, 7) \cdot (1, 3, -5) = 1 + 3 - 35 = -31 \ne 0$.


7.34. Let $B$ be the matrix obtained by normalizing each row of $A$ in Problem 7.33.
(a) Find $B$.
(b) Is $B$ an orthogonal matrix?
(c) Are the columns of $B$ orthogonal?

(a) We have

$\|(1, 1, -1)\|^2 = 1 + 1 + 1 = 3$, $\|(1, 3, 4)\|^2 = 1 + 9 + 16 = 26$
$\|(7, -5, 2)\|^2 = 49 + 25 + 4 = 78$

Thus, $B = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & -1/\sqrt{3} \\ 1/\sqrt{26} & 3/\sqrt{26} & 4/\sqrt{26} \\ 7/\sqrt{78} & -5/\sqrt{78} & 2/\sqrt{78} \end{bmatrix}$

(b) Yes, because the rows of $B$ are still orthogonal and are now unit vectors.

(c) Yes, because the rows of $B$ form an orthonormal set of vectors. Then, by Theorem 7.11, the columns of $B$ must automatically form an orthonormal set.
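The claim that normalizing orthogonal rows makes the columns orthonormal as well can be checked numerically. The sketch below builds $B$ from the matrix of Problem 7.33 and tests both $BB^T$ and $B^TB$ against the identity up to floating-point tolerance; all helper names are illustrative.

```python
import math

A = [[1, 1, -1], [1, 3, 4], [7, -5, 2]]

def normalize_rows(M):
    # Divide each row by its Euclidean length.
    return [[x / math.sqrt(sum(c * c for c in row)) for x in row] for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    # Plain triple-loop matrix product.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def is_identity(M, tol=1e-12):
    return all(abs(M[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(len(M)) for j in range(len(M)))

B = normalize_rows(A)
rows_orthonormal = is_identity(matmul(B, transpose(B)))     # B B^T = I
columns_orthonormal = is_identity(matmul(transpose(B), B))  # B^T B = I

print(rows_orthonormal, columns_orthonormal)
```

The columns of $A$ itself are not orthogonal, so the second check succeeds only because of the row scaling, in line with Theorem 7.11.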


7.35. Prove each of the following:

(a) $P$ is orthogonal if and only if $P^T$ is orthogonal.
(b) If $P$ is orthogonal, then $P^{-1}$ is orthogonal.
(c) If $P$ and $Q$ are orthogonal, then $PQ$ is orthogonal.

(a) We have $(P^T)^T = P$. Thus, $P$ is orthogonal if and only if $PP^T = I$ if and only if $(P^T)^T P^T = I$ if and only if $P^T$ is orthogonal.

(b) We have $P^T = P^{-1}$, because $P$ is orthogonal. Thus, by part (a), $P^{-1}$ is orthogonal.

(c) We have $P^T = P^{-1}$ and $Q^T = Q^{-1}$. Thus, $(PQ)(PQ)^T = PQQ^TP^T = PQQ^{-1}P^{-1} = I$. Therefore, $(PQ)^T = (PQ)^{-1}$, and so $PQ$ is orthogonal.
7.36. Suppose $P$ is an orthogonal matrix. Show that
(a) $\langle Pu, Pv \rangle = \langle u, v \rangle$ for any $u, v \in V$;
(b) $\|Pu\| = \|u\|$ for every $u \in V$.

Use $P^T P = I$ and $\langle u, v \rangle = u^T v$.

(a) $\langle Pu, Pv \rangle = (Pu)^T (Pv) = u^T P^T P v = u^T v = \langle u, v \rangle$.

(b) We have

$\|Pu\|^2 = \langle Pu, Pu \rangle = u^T P^T P u = u^T u = \langle u, u \rangle = \|u\|^2$

Taking the square root of both sides gives our result.


7.37. Prove Theorem 7.12: Suppose $E = \{e_i\}$ and $E' = \{e_i'\}$ are orthonormal bases of $V$. Let $P$ be the change-of-basis matrix from $E$ to $E'$. Then $P$ is orthogonal.

Suppose

$e_i' = b_{i1} e_1 + b_{i2} e_2 + \cdots + b_{in} e_n$, $i = 1, \ldots, n$ (1)

Using Problem 7.18(b) and the fact that $E'$ is orthonormal, we get

$\delta_{ij} = \langle e_i', e_j' \rangle = b_{i1} b_{j1} + b_{i2} b_{j2} + \cdots + b_{in} b_{jn}$ (2)

Let $B = [b_{ij}]$ be the matrix of the coefficients in (1). (Then $P = B^T$.) Suppose $BB^T = [c_{ij}]$. Then

$c_{ij} = b_{i1} b_{j1} + b_{i2} b_{j2} + \cdots + b_{in} b_{jn}$ (3)

By (2) and (3), we have $c_{ij} = \delta_{ij}$. Thus, $BB^T = I$. Accordingly, $B$ is orthogonal, and hence, $P = B^T$ is orthogonal.


7.38. Prove Theorem 7.13: Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of an inner product space $V$. Let $P = [a_{ij}]$ be an orthogonal matrix. Then the following $n$ vectors form an orthonormal basis for $V$:

$e_i' = a_{1i} e_1 + a_{2i} e_2 + \cdots + a_{ni} e_n$, $i = 1, 2, \ldots, n$

Because $\{e_i\}$ is orthonormal, we get, by Problem 7.18(b),

$\langle e_i', e_j' \rangle = a_{1i} a_{1j} + a_{2i} a_{2j} + \cdots + a_{ni} a_{nj} = \langle C_i, C_j \rangle$

where $C_i$ denotes the $i$th column of the orthogonal matrix $P = [a_{ij}]$. Because $P$ is orthogonal, its columns form an orthonormal set. This implies $\langle e_i', e_j' \rangle = \langle C_i, C_j \rangle = \delta_{ij}$. Thus, $\{e_i'\}$ is an orthonormal basis.


Inner Products And Positive Definite Matrices 


7.39. Which of the following symmetric matrices are positive definite?

(a) $A = \begin{bmatrix} 3 & 4 \\ 4 & 5 \end{bmatrix}$, (b) $B = \begin{bmatrix} 8 & -3 \\ -3 & 2 \end{bmatrix}$, (c) $C = \begin{bmatrix} 2 & 1 \\ 1 & -3 \end{bmatrix}$, (d) $D = \begin{bmatrix} 3 & 5 \\ 5 & 9 \end{bmatrix}$

Use Theorem 7.14 that a $2 \times 2$ real symmetric matrix is positive definite if and only if its diagonal entries are positive and its determinant is positive.

(a) No, because $|A| = 15 - 16 = -1$ is negative.
(b) Yes.
(c) No, because the diagonal entry $-3$ is negative.
(d) Yes.


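7.40. Find the values of $k$ that make each of the following matrices positive definite:

(a) $A = \begin{bmatrix} 2 & -4 \\ -4 & k \end{bmatrix}$, (b) $B = \begin{bmatrix} 4 & k \\ k & 9 \end{bmatrix}$, (c) $C = \begin{bmatrix} k & 5 \\ 5 & -2 \end{bmatrix}$

(a) First, $k$ must be positive. Also, $|A| = 2k - 16$ must be positive; that is, $2k - 16 > 0$. Hence, $k > 8$.
(b) We need $|B| = 36 - k^2$ positive; that is, $36 - k^2 > 0$. Hence, $k^2 < 36$ or $-6 < k < 6$.
(c) $C$ can never be positive definite, because $C$ has a negative diagonal entry $-2$.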


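7.41. Find the matrix $A$ that represents the usual inner product on $\mathbf{R}^2$ relative to each of the following bases of $\mathbf{R}^2$: (a) $\{v_1 = (1, 3),\ v_2 = (2, 5)\}$; (b) $\{w_1 = (1, 2),\ w_2 = (4, -2)\}$.

(a) Compute $\langle v_1, v_1 \rangle = 1 + 9 = 10$, $\langle v_1, v_2 \rangle = 2 + 15 = 17$, $\langle v_2, v_2 \rangle = 4 + 25 = 29$. Thus, $A = \begin{bmatrix} 10 & 17 \\ 17 & 29 \end{bmatrix}$.

(b) Compute $\langle w_1, w_1 \rangle = 1 + 4 = 5$, $\langle w_1, w_2 \rangle = 4 - 4 = 0$, $\langle w_2, w_2 \rangle = 16 + 4 = 20$. Thus, $A = \begin{bmatrix} 5 & 0 \\ 0 & 20 \end{bmatrix}$.
(Because the basis vectors are orthogonal, the matrix $A$ is diagonal.)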


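7.42. Consider the vector space $P_2(t)$ with inner product $\langle f, g \rangle = \int_{-1}^{1} f(t)\,g(t)\,dt$.

(a) Find $\langle f, g \rangle$, where $f(t) = t + 2$ and $g(t) = t^2 - 3t + 4$.
(b) Find the matrix $A$ of the inner product with respect to the basis $\{1, t, t^2\}$ of $V$.
(c) Verify Theorem 7.16 by showing that $\langle f, g \rangle = [f]^T A [g]$ with respect to the basis $\{1, t, t^2\}$.

(a) $\langle f, g \rangle = \int_{-1}^{1} (t + 2)(t^2 - 3t + 4)\,dt = \int_{-1}^{1} (t^3 - t^2 - 2t + 8)\,dt = \left[ \frac{t^4}{4} - \frac{t^3}{3} - t^2 + 8t \right]_{-1}^{1} = \frac{46}{3}$

(b) Here we use the fact that if $r + s = n$,

$\langle t^r, t^s \rangle = \int_{-1}^{1} t^n\,dt = \left[ \frac{t^{n+1}}{n+1} \right]_{-1}^{1} = \begin{cases} 2/(n+1) & \text{if } n \text{ is even,} \\ 0 & \text{if } n \text{ is odd.} \end{cases}$

Then $\langle 1, 1 \rangle = 2$, $\langle 1, t \rangle = 0$, $\langle 1, t^2 \rangle = \frac{2}{3}$, $\langle t, t \rangle = \frac{2}{3}$, $\langle t, t^2 \rangle = 0$, $\langle t^2, t^2 \rangle = \frac{2}{5}$. Thus,

$A = \begin{bmatrix} 2 & 0 & \frac{2}{3} \\ 0 & \frac{2}{3} & 0 \\ \frac{2}{3} & 0 & \frac{2}{5} \end{bmatrix}$

(c) We have $[f]^T = (2, 1, 0)$ and $[g]^T = (4, -3, 1)$ relative to the given basis. Then

$[f]^T A [g] = (2, 1, 0) \begin{bmatrix} 2 & 0 & \frac{2}{3} \\ 0 & \frac{2}{3} & 0 \\ \frac{2}{3} & 0 & \frac{2}{5} \end{bmatrix} \begin{bmatrix} 4 \\ -3 \\ 1 \end{bmatrix} = \left( 4, \frac{2}{3}, \frac{4}{3} \right) \begin{bmatrix} 4 \\ -3 \\ 1 \end{bmatrix} = \frac{46}{3} = \langle f, g \rangle$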

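7.43. Prove Theorem 7.14: $A = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$ is positive definite if and only if $a$ and $d$ are positive and $|A| = ad - b^2$ is positive.

Let $u = [x, y]^T$. Then

$f(u) = u^T A u = [x, y] \begin{bmatrix} a & b \\ b & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = ax^2 + 2bxy + dy^2$

Suppose $f(u) > 0$ for every $u \ne 0$. Then $f(1, 0) = a > 0$ and $f(0, 1) = d > 0$. Also, we have $f(b, -a) = a(ad - b^2) > 0$. Because $a > 0$, we get $ad - b^2 > 0$.
Conversely, suppose $a > 0$, $d > 0$, $ad - b^2 > 0$. Completing the square gives us

$f(u) = a\left( x^2 + \dfrac{2b}{a}xy + \dfrac{b^2}{a^2}y^2 \right) + dy^2 - \dfrac{b^2}{a}y^2 = a\left( x + \dfrac{b}{a}y \right)^2 + \dfrac{ad - b^2}{a}y^2$

Accordingly, $f(u) > 0$ for every $u \ne 0$.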


7.44. Prove Theorem 7.15: Let $A$ be a real positive definite matrix. Then the function $\langle u, v \rangle = u^T A v$ is an inner product on $\mathbf{R}^n$.

For any vectors $u_1$, $u_2$, and $v$,

$\langle u_1 + u_2, v \rangle = (u_1 + u_2)^T A v = (u_1^T + u_2^T) A v = u_1^T A v + u_2^T A v = \langle u_1, v \rangle + \langle u_2, v \rangle$

and, for any scalar $k$ and vectors $u$, $v$,

$\langle ku, v \rangle = (ku)^T A v = k u^T A v = k \langle u, v \rangle$

Thus $[I_1]$ is satisfied.
Because $u^T A v$ is a scalar, $(u^T A v)^T = u^T A v$. Also, $A^T = A$ because $A$ is symmetric. Therefore,

$\langle u, v \rangle = u^T A v = (u^T A v)^T = v^T A^T u = v^T A u = \langle v, u \rangle$

Thus, $[I_2]$ is satisfied.
Last, because $A$ is positive definite, $X^T A X > 0$ for any nonzero $X \in \mathbf{R}^n$. Thus, for any nonzero vector $v$, $\langle v, v \rangle = v^T A v > 0$. Also, $\langle 0, 0 \rangle = 0^T A 0 = 0$. Thus, $[I_3]$ is satisfied. Accordingly, the function $\langle u, v \rangle = u^T A v$ is an inner product.
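7.45. Prove Theorem 7.16: Let $A$ be the matrix representation of an inner product relative to a basis $S$ of $V$. Then, for any vectors $u, v \in V$, we have

$\langle u, v \rangle = [u]^T A [v]$

Suppose $S = \{w_1, w_2, \ldots, w_n\}$ and $A = [k_{ij}]$. Hence, $k_{ij} = \langle w_i, w_j \rangle$. Suppose

$u = a_1 w_1 + a_2 w_2 + \cdots + a_n w_n$ and $v = b_1 w_1 + b_2 w_2 + \cdots + b_n w_n$

Then $\langle u, v \rangle = \sum_{i=1}^n \sum_{j=1}^n a_i b_j \langle w_i, w_j \rangle$ (1)

On the other hand,

$[u]^T A [v] = (a_1, a_2, \ldots, a_n) \begin{bmatrix} k_{11} & k_{12} & \cdots & k_{1n} \\ k_{21} & k_{22} & \cdots & k_{2n} \\ \vdots & \vdots & & \vdots \\ k_{n1} & k_{n2} & \cdots & k_{nn} \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$
$= \left( \sum_{i=1}^n a_i k_{i1},\ \sum_{i=1}^n a_i k_{i2},\ \ldots,\ \sum_{i=1}^n a_i k_{in} \right) \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = \sum_{j=1}^n \sum_{i=1}^n a_i b_j k_{ij}$ (2)

Equations (1) and (2) give us our result.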
7.46. Prove Theorem 7.17: Let $A$ be the matrix representation of any inner product on $V$. Then $A$ is a positive definite matrix.

Because $\langle w_i, w_j \rangle = \langle w_j, w_i \rangle$ for any basis vectors $w_i$ and $w_j$, the matrix $A$ is symmetric. Let $X$ be any nonzero vector in $\mathbf{R}^n$. Then $[u] = X$ for some nonzero vector $u \in V$. Theorem 7.16 tells us that $X^T A X = [u]^T A [u] = \langle u, u \rangle > 0$. Thus, $A$ is positive definite.


Complex Inner Product Spaces 


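7.47. Let $V$ be a complex inner product space. Verify the relation

$\langle u, a v_1 + b v_2 \rangle = \bar{a} \langle u, v_1 \rangle + \bar{b} \langle u, v_2 \rangle$

Using $[I_2^*]$, $[I_1^*]$, and then $[I_2^*]$, we find

$\langle u, a v_1 + b v_2 \rangle = \overline{\langle a v_1 + b v_2, u \rangle} = \overline{a \langle v_1, u \rangle + b \langle v_2, u \rangle} = \bar{a}\,\overline{\langle v_1, u \rangle} + \bar{b}\,\overline{\langle v_2, u \rangle} = \bar{a} \langle u, v_1 \rangle + \bar{b} \langle u, v_2 \rangle$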


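7.48. Suppose $\langle u, v \rangle = 3 + 2i$ in a complex inner product space $V$. Find
(a) $\langle (2 - 4i)u, v \rangle$; (b) $\langle u, (4 + 3i)v \rangle$; (c) $\langle (3 - 6i)u, (5 - 2i)v \rangle$.

(a) $\langle (2 - 4i)u, v \rangle = (2 - 4i)\langle u, v \rangle = (2 - 4i)(3 + 2i) = 14 - 8i$

(b) $\langle u, (4 + 3i)v \rangle = \overline{(4 + 3i)}\langle u, v \rangle = (4 - 3i)(3 + 2i) = 18 - i$

(c) $\langle (3 - 6i)u, (5 - 2i)v \rangle = (3 - 6i)\overline{(5 - 2i)}\langle u, v \rangle = (3 - 6i)(5 + 2i)(3 + 2i) = 129 - 18i$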
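The rule at work is $\langle au, bv \rangle = a\,\bar{b}\,\langle u, v \rangle$: scalars come out of the first position unchanged and out of the second position conjugated. A minimal sketch using Python's built-in complex type follows; the function name `scaled_inner` is an illustrative choice.

```python
def scaled_inner(a, b, uv):
    """Return <a u, b v> given the value uv = <u, v> (conjugate-linear in the second slot)."""
    return a * b.conjugate() * uv

uv = 3 + 2j  # the given value of <u, v>

part_a = scaled_inner(2 - 4j, 1 + 0j, uv)
part_b = scaled_inner(1 + 0j, 4 + 3j, uv)
part_c = scaled_inner(3 - 6j, 5 - 2j, uv)

print(part_a, part_b, part_c)
```

All quantities here are small Gaussian integers, so the floating-point results are exact.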


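7.49. Find the Fourier coefficient (component) $c$ and the projection $cw$ of $v = (3 + 4i,\ 2 - 3i)$ along $w = (5 + i,\ 2i)$ in $\mathbf{C}^2$.

Recall that $c = \langle v, w \rangle / \langle w, w \rangle$. Compute

$\langle v, w \rangle = (3 + 4i)\overline{(5 + i)} + (2 - 3i)\overline{(2i)} = (3 + 4i)(5 - i) + (2 - 3i)(-2i)$
$= 19 + 17i - 6 - 4i = 13 + 13i$
$\langle w, w \rangle = 25 + 1 + 4 = 30$

Thus, $c = (13 + 13i)/30 = \frac{13}{30} + \frac{13}{30}i$. Accordingly, $\operatorname{proj}(v, w) = cw = \left( \frac{26}{15} + \frac{39}{15}i,\ -\frac{13}{15} + \frac{13}{15}i \right)$.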
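7.50. Prove Theorem 7.18 (Cauchy–Schwarz): Let $V$ be a complex inner product space. Then $|\langle u, v \rangle| \le \|u\|\,\|v\|$.

If $v = 0$, the inequality reduces to $0 \le 0$ and hence is valid. Now suppose $v \ne 0$. Using $z\bar{z} = |z|^2$ (for any complex number $z$) and $\langle v, u \rangle = \overline{\langle u, v \rangle}$, we expand $\|u - \langle u, v \rangle t v\|^2 \ge 0$, where $t$ is any real value:

$0 \le \|u - \langle u, v \rangle t v\|^2 = \langle u - \langle u, v \rangle t v,\ u - \langle u, v \rangle t v \rangle$
$= \langle u, u \rangle - \overline{\langle u, v \rangle}\,t\,\langle u, v \rangle - \langle u, v \rangle\,t\,\langle v, u \rangle + \langle u, v \rangle \overline{\langle u, v \rangle}\,t^2 \langle v, v \rangle$
$= \|u\|^2 - 2t\,|\langle u, v \rangle|^2 + |\langle u, v \rangle|^2\,t^2\,\|v\|^2$

Set $t = 1/\|v\|^2$ to find $0 \le \|u\|^2 - \dfrac{|\langle u, v \rangle|^2}{\|v\|^2}$, from which $|\langle u, v \rangle|^2 \le \|u\|^2 \|v\|^2$. Taking the square root of both sides, we obtain the required inequality.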


7.51. Find an orthogonal basis for $u^\perp$ in $\mathbf{C}^3$ where $u = (1, i, 1 + i)$.

Here $u^\perp$ consists of all vectors $w = (x, y, z)$ such that

$\langle w, u \rangle = x - iy + (1 - i)z = 0$

Find one solution, say $w_1 = (0,\ 1 - i,\ i)$. Then find a solution of the system

$x - iy + (1 - i)z = 0$, $(1 + i)y - iz = 0$

Here $z$ is a free variable. Set $z = 1$ to obtain $y = i/(1 + i) = (1 + i)/2$ and $x = (3i - 3)/2$. Multiplying by 2 yields the solution $w_2 = (3i - 3,\ 1 + i,\ 2)$. The vectors $w_1$ and $w_2$ form an orthogonal basis for $u^\perp$.
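7.52. Find an orthonormal basis of the subspace $W$ of $\mathbf{C}^3$ spanned by

$v_1 = (1, i, 0)$ and $v_2 = (1, 2, 1 - i)$.

Apply the Gram–Schmidt algorithm. Set $w_1 = v_1 = (1, i, 0)$. Compute

$v_2 - \dfrac{\langle v_2, w_1 \rangle}{\langle w_1, w_1 \rangle} w_1 = (1, 2, 1 - i) - \dfrac{1 - 2i}{2}(1, i, 0) = \left( \tfrac{1}{2} + i,\ 1 - \tfrac{1}{2}i,\ 1 - i \right)$

Multiply by 2 to clear fractions, obtaining $w_2 = (1 + 2i,\ 2 - i,\ 2 - 2i)$. Next find $\|w_1\| = \sqrt{2}$ and then $\|w_2\| = \sqrt{18}$. Normalizing $\{w_1, w_2\}$, we obtain the following orthonormal basis of $W$:

$u_1 = \left( \dfrac{1}{\sqrt{2}}, \dfrac{i}{\sqrt{2}}, 0 \right)$, $u_2 = \left( \dfrac{1 + 2i}{\sqrt{18}}, \dfrac{2 - i}{\sqrt{18}}, \dfrac{2 - 2i}{\sqrt{18}} \right)$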


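7.53. Find the matrix $P$ that represents the usual inner product on $\mathbf{C}^3$ relative to the basis $\{1, i, 1 - i\}$.
Compute the following six inner products:

$\langle 1, 1 \rangle = 1$, $\langle 1, i \rangle = \bar{i} = -i$, $\langle 1, 1 - i \rangle = \overline{1 - i} = 1 + i$
$\langle i, i \rangle = i\bar{i} = 1$, $\langle i, 1 - i \rangle = i\,\overline{(1 - i)} = -1 + i$, $\langle 1 - i, 1 - i \rangle = 2$

Then, using $\langle u, v \rangle = \overline{\langle v, u \rangle}$, we obtain

$P = \begin{bmatrix} 1 & -i & 1 + i \\ i & 1 & -1 + i \\ 1 - i & -1 - i & 2 \end{bmatrix}$

(As expected, $P$ is Hermitian; that is, $P^H = P$.)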


Normed Vector Spaces 
7.54. Consider vectors $u = (1, 3, -6, 4)$ and $v = (3, -5, 1, -2)$ in $\mathbf{R}^4$. Find
(a) $\|u\|_\infty$ and $\|v\|_\infty$, (b) $\|u\|_1$ and $\|v\|_1$, (c) $\|u\|_2$ and $\|v\|_2$,
(d) $d_\infty(u, v)$, $d_1(u, v)$, $d_2(u, v)$.

(a) The infinity norm chooses the maximum of the absolute values of the components. Hence,

$\|u\|_\infty = 6$ and $\|v\|_\infty = 5$

(b) The one-norm adds the absolute values of the components. Thus,

$\|u\|_1 = 1 + 3 + 6 + 4 = 14$ and $\|v\|_1 = 3 + 5 + 1 + 2 = 11$

(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by the usual inner product on $\mathbf{R}^4$). Thus,

$\|u\|_2 = \sqrt{1 + 9 + 36 + 16} = \sqrt{62}$ and $\|v\|_2 = \sqrt{9 + 25 + 1 + 4} = \sqrt{39}$

(d) First find $u - v = (-2, 8, -7, 6)$. Then

$d_\infty(u, v) = \|u - v\|_\infty = 8$
$d_1(u, v) = \|u - v\|_1 = 2 + 8 + 7 + 6 = 23$
$d_2(u, v) = \|u - v\|_2 = \sqrt{4 + 64 + 49 + 36} = \sqrt{153}$


7.55. Consider the function $f(t) = t^2 - 4t$ in C[0, 3].
(a) Find $\|f\|_\infty$, (b) Plot f(t) in the plane $\mathbf{R}^2$, (c) Find $\|f\|_1$, (d) Find $\|f\|_2$.

(a) We seek $\|f\|_\infty = \max(|f(t)|)$. Because f(t) is differentiable on [0, 3], |f(t)| has a maximum at a
critical point of f(t) (i.e., when the derivative $f'(t) = 0$), or at an endpoint of [0, 3]. Because
$f'(t) = 2t - 4$, we set $2t - 4 = 0$ and obtain t = 2 as a critical point. Compute

\[ f(2) = 4 - 8 = -4, \qquad f(0) = 0 - 0 = 0, \qquad f(3) = 9 - 12 = -3 \]
Thus, $\|f\|_\infty = |f(2)| = |-4| = 4$.
(b) Compute f(t) for various values of t in [0, 3], for example,

    t    |  0    1    2    3
    f(t) |  0   -3   -4   -3

Plot the points in $\mathbf{R}^2$ and then draw a continuous curve through the points, as shown in Fig. 7-8.


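(c) We seek $\|f\|_1 = \int_0^3 |f(t)|\,dt$. As indicated in Fig. 7-8, f(t) is nonpositive on [0, 3]; hence,
\[ |f(t)| = -(t^2 - 4t) = 4t - t^2 \]
Thus,
\[ \|f\|_1 = \int_0^3 (4t - t^2)\,dt = \left(2t^2 - \frac{t^3}{3}\right)\Big|_0^3 = 18 - 9 = 9 \]
(d) We have
\[ \|f\|_2^2 = \int_0^3 f(t)^2\,dt = \int_0^3 (t^4 - 8t^3 + 16t^2)\,dt = \left(\frac{t^5}{5} - 2t^4 + \frac{16t^3}{3}\right)\Big|_0^3 = \frac{153}{5} \]
Thus, $\|f\|_2 = \sqrt{153/5}$.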
Figure 7-8. Graph of $f(t) = t^2 - 4t$ on [0, 3].


7.56. Prove Theorem 7.24: Let V be a normed vector space. Then the function $d(u, v) = \|u - v\|$
satisfies the following three axioms of a metric space:

[M1] $d(u, v) \ge 0$; and $d(u, v) = 0$ iff $u = v$.
[M2] $d(u, v) = d(v, u)$.
[M3] $d(u, v) \le d(u, w) + d(w, v)$.

If $u \ne v$, then $u - v \ne 0$, and hence, $d(u, v) = \|u - v\| > 0$. Also, $d(u, u) = \|u - u\| = \|0\| = 0$. Thus,
[M1] is satisfied. We also have

\[ d(u, v) = \|u - v\| = \|-1(v - u)\| = |-1|\,\|v - u\| = \|v - u\| = d(v, u) \]
and
\[ d(u, v) = \|u - v\| = \|(u - w) + (w - v)\| \le \|u - w\| + \|w - v\| = d(u, w) + d(w, v) \]

Thus, [M2] and [M3] are satisfied.


SUPPLEMENTARY PROBLEMS 


Inner Products 


7.57. Verify that the following is an inner product on $\mathbf{R}^2$, where $u = (x_1, x_2)$ and $v = (y_1, y_2)$:

\[ f(u, v) = x_1y_1 - 2x_1y_2 - 2x_2y_1 + 5x_2y_2 \]

7.58. Find the values of k so that the following is an inner product on $\mathbf{R}^2$, where $u = (x_1, x_2)$ and $v = (y_1, y_2)$:

\[ f(u, v) = x_1y_1 - 3x_1y_2 - 3x_2y_1 + kx_2y_2 \]
7.59. Consider the vectors $u = (1, -3)$ and $v = (2, 5)$ in $\mathbf{R}^2$. Find
(a) $\langle u, v\rangle$ with respect to the usual inner product in $\mathbf{R}^2$.
(b) $\langle u, v\rangle$ with respect to the inner product in $\mathbf{R}^2$ in Problem 7.57.
(c) $\|v\|$ using the usual inner product in $\mathbf{R}^2$.
(d) $\|v\|$ using the inner product in $\mathbf{R}^2$ in Problem 7.57.

7.60. Show that each of the following is not an inner product on $\mathbf{R}^3$, where $u = (x_1, x_2, x_3)$ and $v = (y_1, y_2, y_3)$:
(a) $\langle u, v\rangle = x_1y_1 + x_2y_2$, (b) $\langle u, v\rangle = x_1y_2x_3 + y_1x_2y_3$.

7.61. Let V be the vector space of m × n matrices over R. Show that $\langle A, B\rangle = \operatorname{tr}(B^TA)$ defines an inner product
in V.

7.62. Suppose $|\langle u, v\rangle| = \|u\|\,\|v\|$. (That is, the Cauchy-Schwarz inequality reduces to an equality.) Show that u
and v are linearly dependent.

7.63. Suppose f(u, v) and g(u, v) are inner products on a vector space V over R. Prove
(a) The sum f + g is an inner product on V, where $(f + g)(u, v) = f(u, v) + g(u, v)$.
(b) The scalar product kf, for k > 0, is an inner product on V, where $(kf)(u, v) = k f(u, v)$.


Orthogonality, Orthogonal Complements, Orthogonal Sets 


7.64. Let V be the vector space of polynomials over R of degree ≤ 2 with inner product defined by
$\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$. Find a basis of the subspace W orthogonal to $h(t) = 2t + 1$.

7.65. Find a basis of the subspace W of $\mathbf{R}^4$ orthogonal to $u_1 = (1, -2, 3, 4)$ and $u_2 = (3, -5, 7, 8)$.

7.66. Find a basis for the subspace W of $\mathbf{R}^5$ orthogonal to the vectors $u_1 = (1, 1, 3, 4, 1)$ and $u_2 = (1, 2, 1, 2, 1)$.

7.67. Let $w = (1, -2, -1, 3)$ be a vector in $\mathbf{R}^4$. Find
(a) an orthogonal basis for $w^\perp$, (b) an orthonormal basis for $w^\perp$.

7.68. Let W be the subspace of $\mathbf{R}^4$ orthogonal to $u_1 = (1, 1, 2, 2)$ and $u_2 = (0, 1, 2, -1)$. Find
(a) an orthogonal basis for W, (b) an orthonormal basis for W. (Compare with Problem 7.65.)

7.69. Let S consist of the following vectors in $\mathbf{R}^4$:

\[ u_1 = (1, 1, 1, 1), \quad u_2 = (1, 1, -1, -1), \quad u_3 = (1, -1, 1, -1), \quad u_4 = (1, -1, -1, 1) \]

(a) Show that S is orthogonal and a basis of $\mathbf{R}^4$.
(b) Write $v = (1, 3, -5, 6)$ as a linear combination of $u_1, u_2, u_3, u_4$.
(c) Find the coordinates of an arbitrary vector $v = (a, b, c, d)$ in $\mathbf{R}^4$ relative to the basis S.
(d) Normalize S to obtain an orthonormal basis of $\mathbf{R}^4$.

7.70. Let $M = M_{2,2}$ with inner product $\langle A, B\rangle = \operatorname{tr}(B^TA)$. Show that the following is an orthonormal basis for M:

\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \]

7.71. Let $M = M_{2,2}$ with inner product $\langle A, B\rangle = \operatorname{tr}(B^TA)$. Find an orthogonal basis for the orthogonal complement
of (a) diagonal matrices, (b) symmetric matrices.
7.72. Suppose $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of vectors. Show that $\{k_1u_1, k_2u_2, \ldots, k_ru_r\}$ is an orthogonal set
for any scalars $k_1, k_2, \ldots, k_r$.

7.73. Let U and W be subspaces of a finite-dimensional inner product space V. Show that
(a) $(U + W)^\perp = U^\perp \cap W^\perp$, (b) $(U \cap W)^\perp = U^\perp + W^\perp$.


Projections, Gram-Schmidt Algorithm, Applications 

7.74. Find the Fourier coefficient c and projection cw of v along w, where
(a) $v = (2, 3, -5)$ and $w = (1, -5, 2)$ in $\mathbf{R}^3$.
(b) $v = (1, 3, 1, 2)$ and $w = (1, -2, 7, 4)$ in $\mathbf{R}^4$.
(c) $v = t^2$ and $w = t + 3$ in P(t), with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$.


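(d) $v = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $w = \begin{bmatrix} 1 & 1 \\ 5 & 5 \end{bmatrix}$ in $M = M_{2,2}$, with inner product $\langle A, B\rangle = \operatorname{tr}(B^TA)$.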


7.75. Let U be the subspace of $\mathbf{R}^4$ spanned by
\[ v_1 = (1, 1, 1, 1), \quad v_2 = (1, -1, 2, 2), \quad v_3 = (1, 2, -3, -4) \]
(a) Apply the Gram-Schmidt algorithm to find an orthogonal and an orthonormal basis for U.
(b) Find the projection of $v = (1, 2, -3, 4)$ onto U.

7.76. Suppose $v = (1, 2, 3, 4, 6)$. Find the projection of v onto W, or, in other words, find $w \in W$ that minimizes
$\|v - w\|$, where W is the subspace of $\mathbf{R}^5$ spanned by
(a) $u_1 = (1, 2, 1, 2, 1)$ and $u_2 = (1, -1, 2, -1, 1)$, (b) $v_1 = (1, 2, 1, 2, 1)$ and $v_2 = (1, 0, 1, 5, -1)$.

7.77. Consider the subspace $W = P_2(t)$ of P(t) with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$. Find the projection of
$f(t) = t^3$ onto W. (Hint: Use the orthogonal polynomials $1$, $2t - 1$, $6t^2 - 6t + 1$ obtained in Problem 7.22.)

7.78. Consider P(t) with inner product $\langle f, g\rangle = \int_{-1}^{1} f(t)g(t)\,dt$ and the subspace $W = P_3(t)$.
(a) Find an orthogonal basis for W by applying the Gram-Schmidt algorithm to $\{1, t, t^2, t^3\}$.
(b) Find the projection of $f(t) = t^5$ onto W.


Orthogonal Matrices 


7.79. Find the number and exhibit all 2 × 2 orthogonal matrices of the form $\begin{bmatrix} \frac{1}{3} & x \\ y & z \end{bmatrix}$.

7.80. Find a 3 × 3 orthogonal matrix P whose first two rows are multiples of $u = (1, 1, 1)$ and $v = (1, -2, 3)$,
respectively.

7.81. Find a symmetric orthogonal matrix P whose first row is $\left(\frac{1}{3}, \frac{2}{3}, \frac{2}{3}\right)$. (Compare with Problem 7.32.)

7.82. Real matrices A and B are said to be orthogonally equivalent if there exists an orthogonal matrix P such that
$B = P^TAP$. Show that this relation is an equivalence relation.


Positive Definite Matrices and Inner Products 

7.83. Find the matrix A that represents the usual inner product on $\mathbf{R}^2$ relative to each of the following bases:
(a) $\{v_1 = (1, 4),\ v_2 = (2, -3)\}$, (b) $\{w_1 = (1, -3),\ w_2 = (6, 2)\}$.

7.84. Consider the following inner product on $\mathbf{R}^2$:

\[ f(u, v) = x_1y_1 - 2x_1y_2 - 2x_2y_1 + 5x_2y_2, \quad\text{where } u = (x_1, x_2),\ v = (y_1, y_2) \]

Find the matrix B that represents this inner product on $\mathbf{R}^2$ relative to each basis in Problem 7.83.
7.85. Find the matrix C that represents the usual inner product on $\mathbf{R}^3$ relative to the basis S of $\mathbf{R}^3$ consisting of the vectors
$u_1 = (1, 1, 1)$, $u_2 = (1, 2, 1)$, $u_3 = (1, -1, 3)$.

7.86. Let $V = P_2(t)$ with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$.
(a) Find $\langle f, g\rangle$, where $f(t) = t + 2$ and $g(t) = t^2 - 3t + 4$.
(b) Find the matrix A of the inner product with respect to the basis $\{1, t, t^2\}$ of V.
(c) Verify Theorem 7.16 that $\langle f, g\rangle = [f]^TA[g]$ with respect to the basis $\{1, t, t^2\}$.

7.87. Determine which of the following matrices are positive definite:


© Eio Ego [io L 


7.88. Suppose A and B are positive definite matrices. Show that:
(a) A + B is positive definite and (b) kA is positive definite for k > 0.

7.89. Suppose B is a real nonsingular matrix. Show that: (a) $B^TB$ is symmetric and (b) $B^TB$ is positive definite.


Complex Inner Product Spaces 


7.90. Verify that
\[ \langle a_1u_1 + a_2u_2,\ b_1v_1 + b_2v_2\rangle = a_1\bar{b}_1\langle u_1, v_1\rangle + a_1\bar{b}_2\langle u_1, v_2\rangle + a_2\bar{b}_1\langle u_2, v_1\rangle + a_2\bar{b}_2\langle u_2, v_2\rangle \]
More generally, prove that $\left\langle \sum_{i=1}^{m} a_iu_i,\ \sum_{j=1}^{n} b_jv_j \right\rangle = \sum_{i,j} a_i\bar{b}_j\langle u_i, v_j\rangle$.

7.91. Consider $u = (1+i,\ 3,\ 4-i)$ and $v = (3-4i,\ 1+i,\ 2i)$ in $\mathbf{C}^3$. Find
(a) $\langle u, v\rangle$, (b) $\langle v, u\rangle$, (c) $\|u\|$, (d) $\|v\|$, (e) $d(u, v)$.

7.92. Find the Fourier coefficient c and the projection cw of
(a) $u = (3+i,\ 5-2i)$ along $w = (5+i,\ 1+i)$ in $\mathbf{C}^2$,
(b) $u = (1-i,\ 3i,\ 1+i)$ along $w = (1,\ 2-i,\ 3+2i)$ in $\mathbf{C}^3$.

7.93. Let $u = (z_1, z_2)$ and $v = (w_1, w_2)$ belong to $\mathbf{C}^2$. Verify that the following is an inner product of $\mathbf{C}^2$:

\[ f(u, v) = z_1\bar{w}_1 + (1+i)z_1\bar{w}_2 + (1-i)z_2\bar{w}_1 + 3z_2\bar{w}_2 \]

7.94. Find an orthogonal basis and an orthonormal basis for the subspace W of $\mathbf{C}^3$ spanned by $u_1 = (1, i, 1)$ and
$u_2 = (1+i,\ 0,\ 2)$.

7.95. Let $u = (z_1, z_2)$ and $v = (w_1, w_2)$ belong to $\mathbf{C}^2$. For what values of $a, b, c, d \in \mathbf{C}$ is the following an inner
product on $\mathbf{C}^2$?

\[ f(u, v) = az_1\bar{w}_1 + bz_1\bar{w}_2 + cz_2\bar{w}_1 + dz_2\bar{w}_2 \]

7.96. Prove the following form for an inner product in a complex space V:

\[ \langle u, v\rangle = \tfrac{1}{4}\|u + v\|^2 - \tfrac{1}{4}\|u - v\|^2 + \tfrac{i}{4}\|u + iv\|^2 - \tfrac{i}{4}\|u - iv\|^2 \]

[Compare with Problem 7.7(b).]

7.97. Let V be a real inner product space. Show that
(i) $\|u\| = \|v\|$ if and only if $\langle u + v,\ u - v\rangle = 0$;
(ii) $\|u + v\|^2 = \|u\|^2 + \|v\|^2$ if and only if $\langle u, v\rangle = 0$.
Show by counterexamples that the above statements are not true for, say, $\mathbf{C}^2$.

7.98. Find the matrix P that represents the usual inner product on $\mathbf{C}^3$ relative to the basis $\{1,\ 1+i,\ 1-2i\}$.
7.99. A complex matrix A is unitary if it is invertible and $A^{-1} = A^H$. Alternatively, A is unitary if its rows
(columns) form an orthonormal set of vectors (relative to the usual inner product of $\mathbf{C}^n$). Find a unitary
matrix whose first row is: (a) a multiple of $(1,\ 1-i)$; (b) a multiple of $\left(\tfrac{1}{2},\ \tfrac{1}{2}i,\ \tfrac{1}{2} - \tfrac{1}{2}i\right)$.


Normed Vector Spaces 
7.100. Consider vectors $u = (1, -3, 4, 1, -2)$ and $v = (3, 1, -2, -3, 1)$ in $\mathbf{R}^5$. Find
(a) $\|u\|_\infty$ and $\|v\|_\infty$, (b) $\|u\|_1$ and $\|v\|_1$, (c) $\|u\|_2$ and $\|v\|_2$, (d) $d_\infty(u, v)$, $d_1(u, v)$, $d_2(u, v)$.

7.101. Repeat Problem 7.100 for $u = (1+i,\ 2-4i)$ and $v = (1-i,\ 2+3i)$ in $\mathbf{C}^2$.

7.102. Consider the functions $f(t) = 5t - t^2$ and $g(t) = 3t - t^2$ in C[0, 4]. Find
(a) $d_\infty(f, g)$, (b) $d_1(f, g)$, (c) $d_2(f, g)$.

7.103. Prove (a) $\|\cdot\|_1$ is a norm on $\mathbf{R}^n$. (b) $\|\cdot\|_\infty$ is a norm on $\mathbf{R}^n$.

7.104. Prove (a) $\|\cdot\|_1$ is a norm on C[a, b]. (b) $\|\cdot\|_\infty$ is a norm on C[a, b].


ANSWERS TO SUPPLEMENTARY PROBLEMS 


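Notation: $M = [R_1;\ R_2;\ \ldots]$ denotes a matrix M with rows $R_1, R_2, \ldots$. Also, a basis need not be unique.

7.58. $k > 9$

7.59. (a) $-13$, (b) $-71$, (c) $\sqrt{29}$, (d) $\sqrt{89}$

7.60. Let $u = (0, 0, 1)$; then $\langle u, u\rangle = 0$ in both cases

7.64. $\{7t^2 - 5t,\ 12t^2 - 5\}$

7.65. $\{(1, 2, 1, 0),\ (4, 4, 0, 1)\}$

7.66. $(-1, 0, 0, 0, 1)$, $(-6, 2, 0, 1, 0)$, $(-5, 2, 1, 0, 0)$

7.67. (a) $u_1 = (0, 0, 3, 1)$, $u_2 = (0, 5, -1, 3)$, $u_3 = (-14, -2, -1, 3)$,
(b) $u_1/\sqrt{10}$, $u_2/\sqrt{35}$, $u_3/\sqrt{210}$

7.68. (a) $(0, 2, -1, 0)$, $(-15, 1, 2, 5)$, (b) $(0, 2, -1, 0)/\sqrt{5}$, $(-15, 1, 2, 5)/\sqrt{255}$

7.69. (b) $v = \tfrac{1}{4}(5u_1 + 3u_2 - 13u_3 + 9u_4)$,
(c) $[v] = \tfrac{1}{4}[a+b+c+d,\ a+b-c-d,\ a-b+c-d,\ a-b-c+d]$

7.71. (a) $[0, 1;\ 0, 0]$, $[0, 0;\ 1, 0]$, (b) $[0, -1;\ 1, 0]$

7.74. (a) $c = -\tfrac{23}{30}$, (b) $c = \tfrac{1}{7}$, (c) $c = \tfrac{15}{148}$, (d) $c = \tfrac{19}{26}$

7.75. (a) $w_1 = (1, 1, 1, 1)$, $w_2 = (0, -2, 1, 1)$, $w_3 = (12, -4, -1, -7)$,
(b) $\operatorname{proj}(v, U) = \tfrac{1}{5}(-1, 12, 3, 6)$

7.76. (a) $\operatorname{proj}(v, W) = \tfrac{1}{8}(23, 25, 30, 25, 23)$, (b) First find an orthogonal basis for W;
say, $w_1 = (1, 2, 1, 2, 1)$ and $w_2 = (0, 2, 0, -3, 2)$. Then $\operatorname{proj}(v, W) = \tfrac{1}{17}(34, 76, 34, 56, 42)$

7.77. $\operatorname{proj}(f, W) = \tfrac{3}{2}t^2 - \tfrac{3}{5}t + \tfrac{1}{20}$

7.78. (a) $\{1,\ t,\ 3t^2 - 1,\ 5t^3 - 3t\}$, (b) $\operatorname{proj}(f, W) = \tfrac{10}{9}t^3 - \tfrac{5}{21}t$

7.79. Four: $[a, b;\ b, -a]$, $[a, b;\ -b, a]$, $[a, -b;\ b, a]$, $[a, -b;\ -b, -a]$, where $a = \tfrac{1}{3}$ and $b = \tfrac{1}{3}\sqrt{8}$

7.80. $P = [1/a, 1/a, 1/a;\ 1/b, -2/b, 3/b;\ 5/c, -2/c, -3/c]$, where $a = \sqrt{3}$, $b = \sqrt{14}$, $c = \sqrt{38}$

7.81. $\tfrac{1}{3}[1, 2, 2;\ 2, -2, 1;\ 2, 1, -2]$

7.83. (a) $[17, -10;\ -10, 13]$, (b) $[10, 0;\ 0, 40]$

7.84. (a) $[65, -68;\ -68, 73]$, (b) $[58, 8;\ 8, 8]$

7.85. $[3, 4, 3;\ 4, 6, 2;\ 3, 2, 11]$

7.86. (a) $\tfrac{83}{12}$, (b) $[1, a, b;\ a, b, c;\ b, c, d]$, where $a = \tfrac{1}{2}$, $b = \tfrac{1}{3}$, $c = \tfrac{1}{4}$, $d = \tfrac{1}{5}$

7.87. (a) No, (b) Yes, (c) No, (d) Yes

7.91. (a) $-4i$, (b) $4i$, (c) $\sqrt{28}$, (d) $\sqrt{31}$, (e) $\sqrt{59}$

7.92. (a) $c = \tfrac{1}{28}(19 - 5i)$, (b) $c = \tfrac{1}{19}(3 + 6i)$

7.94. $\{v_1 = (1, i, 1)/\sqrt{3},\ v_2 = (2i,\ 1-3i,\ 3-i)/\sqrt{24}\}$

7.95. a and d real and positive, $c = \bar{b}$, and $ad - bc$ positive.

7.97. $u = (1, 2)$, $v = (i, 2i)$

7.98. $P = [1,\ 1-i,\ 1+2i;\ \ 1+i,\ 2,\ -1+3i;\ \ 1-2i,\ -1-3i,\ 5]$

7.99. (a) $(1/\sqrt{3})[1,\ 1-i;\ \ 1+i,\ -1]$,
(b) $[a,\ ai,\ a-ai;\ \ bi,\ b,\ 0;\ \ a,\ ai,\ -a+ai]$, where $a = \tfrac{1}{2}$ and $b = 1/\sqrt{2}$.

7.100. (a) 4 and 3, (b) 11 and 10, (c) $\sqrt{31}$ and $\sqrt{24}$, (d) 6, 19, 9

7.101. (a) $\sqrt{20}$ and $\sqrt{13}$, (b) $\sqrt{2} + \sqrt{20}$ and $\sqrt{2} + \sqrt{13}$, (c) $\sqrt{22}$ and $\sqrt{15}$, (d) 7, 9, $\sqrt{53}$

7.102. (a) 8, (b) 16, (c) $16/\sqrt{3}$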
Determinants


8.1 Introduction 


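Each n-square matrix $A = [a_{ij}]$ is assigned a special scalar called the determinant of A, denoted by det(A)
or |A| or

\[ \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} \]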


We emphasize that an n x n array of scalars enclosed by straight lines, called a determinant of order n, is 
not a matrix but denotes the determinant of the enclosed array of scalars (i.e., the enclosed matrix). 

The determinant function was first discovered during the investigation of systems of linear equations. 
We shall see that the determinant is an indispensable tool in investigating and obtaining properties of 
square matrices. 

The definition of the determinant and most of its properties also apply in the case where the entries of a 
matrix come from a commutative ring. 

We begin with a special case of determinants of orders 1, 2, and 3. Then we define a determinant of 
arbitrary order. This general definition is preceded by a discussion of permutations, which is necessary for 
our general definition of the determinant. 


8.2 Determinants of Orders 1 and 2 


Determinants of orders 1 and 2 are defined as follows:

\[ |a_{11}| = a_{11} \quad\text{and}\quad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21} \]

Thus, the determinant of a 1 × 1 matrix $A = [a_{11}]$ is the scalar $a_{11}$; that is, $\det(A) = |a_{11}| = a_{11}$. The
determinant of order two may easily be remembered by using the following diagram:

[Diagram: a plus-labeled arrow along the diagonal from $a_{11}$ to $a_{22}$ and a minus-labeled arrow along the diagonal from $a_{12}$ to $a_{21}$.]

That is, the determinant is equal to the product of the elements along the plus-labeled arrow minus the
product of the elements along the minus-labeled arrow. (There is an analogous diagram for determinants
of order 3, but not for higher-order determinants.)


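EXAMPLE 8.1

(a) Because the determinant of order 1 is the scalar itself, we have:
\[ \det(27) = 27, \qquad \det(-7) = -7, \qquad \det(t - 3) = t - 3 \]

(b) $\begin{vmatrix} 5 & 3 \\ 4 & 6 \end{vmatrix} = 5(6) - 3(4) = 30 - 12 = 18, \qquad \begin{vmatrix} 3 & 2 \\ -5 & 7 \end{vmatrix} = 21 + 10 = 31$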
Application to Linear Equations


Consider two linear equations in two unknowns, say

\[ a_1x + b_1y = c_1 \]
\[ a_2x + b_2y = c_2 \]

Let $D = a_1b_2 - a_2b_1$, the determinant of the matrix of coefficients. Then the system has a unique solution
if and only if $D \ne 0$. In such a case, the unique solution may be expressed completely in terms of
determinants as follows:

\[ x = \frac{N_x}{D} = \frac{b_2c_1 - b_1c_2}{a_1b_2 - a_2b_1} = \frac{\begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}, \qquad y = \frac{N_y}{D} = \frac{a_1c_2 - a_2c_1}{a_1b_2 - a_2b_1} = \frac{\begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}} \]

Here D appears in the denominator of both quotients. The numerators $N_x$ and $N_y$ of the quotients for x and
y, respectively, can be obtained by substituting the column of constant terms in place of the column of
coefficients of the given unknown in the matrix of coefficients. On the other hand, if D = 0, then the
system may have no solution or more than one solution.


EXAMPLE 8.2 Solve by determinants the system

\[ 4x - 3y = 15 \]
\[ 2x + 5y = 1 \]

First find the determinant D of the matrix of coefficients:

\[ D = \begin{vmatrix} 4 & -3 \\ 2 & 5 \end{vmatrix} = 4(5) - (-3)(2) = 20 + 6 = 26 \]

Because $D \ne 0$, the system has a unique solution. To obtain the numerators $N_x$ and $N_y$, simply replace, in the matrix
of coefficients, the coefficients of x and y, respectively, by the constant terms, and then take their determinants:

\[ N_x = \begin{vmatrix} 15 & -3 \\ 1 & 5 \end{vmatrix} = 75 + 3 = 78, \qquad N_y = \begin{vmatrix} 4 & 15 \\ 2 & 1 \end{vmatrix} = 4 - 30 = -26 \]

Then the unique solution of the system is

\[ x = \frac{N_x}{D} = \frac{78}{26} = 3, \qquad y = \frac{N_y}{D} = \frac{-26}{26} = -1 \]
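The determinant formulas for a 2 × 2 system translate directly into code. The sketch below is illustrative only: the names `det2` and `solve_2x2` are assumptions, and exact rational arithmetic via `fractions.Fraction` is one possible choice for keeping the quotients exact.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant of the 2 x 2 array [[a, b], [c, d]]."""
    return a * d - b * c

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by determinants.

    Returns None when D = 0 (no solution or more than one solution).
    """
    D = det2(a1, b1, a2, b2)
    if D == 0:
        return None
    Nx = det2(c1, b1, c2, b2)  # constants replace the x-column
    Ny = det2(a1, c1, a2, c2)  # constants replace the y-column
    return Fraction(Nx, D), Fraction(Ny, D)

# The system of Example 8.2: 4x - 3y = 15, 2x + 5y = 1
print(solve_2x2(4, -3, 15, 2, 5, 1))
```

When D = 0 the function deliberately returns nothing more specific, mirroring the text: the determinant alone does not distinguish between no solution and many solutions.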


8.3 Determinants of Order 3 


Consider an arbitrary 3 × 3 matrix $A = [a_{ij}]$. The determinant of A is defined as follows:

\[ \det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32} \]

Observe that there are six products, each product consisting of three elements of the original matrix.
Three of the products are plus-labeled (keep their sign) and three of the products are minus-labeled
(change their sign).

The diagrams in Fig. 8-1 may help us to remember the above six products in det(A). That is, the
determinant is equal to the sum of the products of the elements along the three plus-labeled arrows in
Fig. 8-1 plus the sum of the negatives of the products of the elements along the three minus-labeled
arrows. We emphasize that there are no such diagrammatic devices with which to remember determinants
of higher order.


Figure 8-1 


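EXAMPLE 8.3 Let $A = \begin{bmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 2 & 1 \\ -4 & 5 & -1 \\ 2 & -3 & 4 \end{bmatrix}$. Find det(A) and det(B).

Use the diagrams in Fig. 8-1:

\[ \det(A) = 2(5)(4) + 1(-2)(1) + 1(-3)(0) - 1(5)(1) - (-3)(-2)(2) - 4(1)(0) = 40 - 2 + 0 - 5 - 12 - 0 = 21 \]
\[ \det(B) = 60 - 4 + 12 - 10 - 9 + 32 = 81 \]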
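The six-product formula for a determinant of order 3 can be written out term by term. This is a small sketch (the function name `det3` is an illustrative assumption) that reproduces the values of Example 8.3.

```python
def det3(m):
    """Six-term formula for a 3 x 3 determinant; m is a list of three rows."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    return (a11 * a22 * a33 + a12 * a23 * a31 + a13 * a21 * a32
            - a13 * a22 * a31 - a12 * a21 * a33 - a11 * a23 * a32)

A = [[2, 1, 1], [0, 5, -2], [1, -3, 4]]
B = [[3, 2, 1], [-4, 5, -1], [2, -3, 4]]
print(det3(A), det3(B))
```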


Alternative Form for a Determinant of Order 3 


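The determinant of the 3 × 3 matrix $A = [a_{ij}]$ may be rewritten as follows:

\[ \det(A) = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}) \]
\[ = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]

which is a linear combination of three determinants of order 2 whose coefficients (with alternating signs)
form the first row of the given matrix. This linear combination may be indicated in the form

\[ a_{11}\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} \]

where, in each array, the first row and the column containing the coefficient are struck out.
Note that each 2 × 2 matrix can be obtained by deleting, in the original matrix, the row and column
containing its coefficient.


EXAMPLE 8.4

\[ \begin{vmatrix} 1 & 2 & 3 \\ 4 & -2 & 3 \\ 0 & 5 & -1 \end{vmatrix} = 1\begin{vmatrix} -2 & 3 \\ 5 & -1 \end{vmatrix} - 2\begin{vmatrix} 4 & 3 \\ 0 & -1 \end{vmatrix} + 3\begin{vmatrix} 4 & -2 \\ 0 & 5 \end{vmatrix} \]
\[ = 1(2 - 15) - 2(-4 - 0) + 3(20 + 0) = -13 + 8 + 60 = 55 \]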
8.4 Permutations


A permutation $\sigma$ of the set $\{1, 2, \ldots, n\}$ is a one-to-one mapping of the set onto itself or, equivalently, a
rearrangement of the numbers $1, 2, \ldots, n$. Such a permutation $\sigma$ is denoted by

\[ \sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ j_1 & j_2 & \cdots & j_n \end{pmatrix} \quad\text{or}\quad \sigma = j_1j_2\cdots j_n, \quad\text{where } j_i = \sigma(i) \]

The set of all such permutations is denoted by $S_n$, and the number of such permutations is n!. If $\sigma \in S_n$,
then the inverse mapping $\sigma^{-1} \in S_n$; and if $\sigma, \tau \in S_n$, then the composition mapping $\sigma \circ \tau \in S_n$. Also, the
identity mapping $\varepsilon = \sigma \circ \sigma^{-1} \in S_n$. (In fact, $\varepsilon = 123\ldots n$.)


EXAMPLE 8.5 


(a) There are $2! = 2 \cdot 1 = 2$ permutations in $S_2$; they are 12 and 21.
(b) There are $3! = 3 \cdot 2 \cdot 1 = 6$ permutations in $S_3$; they are 123, 132, 213, 231, 312, 321.


Sign (Parity) of a Permutation 


Consider an arbitrary permutation $\sigma$ in $S_n$, say $\sigma = j_1j_2\cdots j_n$. We say $\sigma$ is an even or odd permutation
according to whether there is an even or odd number of inversions in $\sigma$. By an inversion in $\sigma$ we mean a
pair of integers (i, k) such that i > k, but i precedes k in $\sigma$. We then define the sign or parity of $\sigma$, written
sgn $\sigma$, by

\[ \operatorname{sgn}\sigma = \begin{cases} 1 & \text{if } \sigma \text{ is even} \\ -1 & \text{if } \sigma \text{ is odd} \end{cases} \]


EXAMPLE 8.6


(a) Find the sign of $\sigma = 35142$ in $S_5$.
For each element k, we count the number of elements i such that i > k and i precedes k in $\sigma$. There are

2 numbers (3 and 5) greater than and preceding 1,
3 numbers (3, 5, and 4) greater than and preceding 2,
1 number (5) greater than and preceding 4.

(There are no numbers greater than and preceding either 3 or 5.) Because there are, in all, six inversions, $\sigma$ is
even and sgn $\sigma$ = 1.


(b) The identity permutation $\varepsilon = 123\ldots n$ is even because there are no inversions in $\varepsilon$.


(c) In $S_2$, the permutation 12 is even and 21 is odd. In $S_3$, the permutations 123, 231, 312 are even and the
permutations 132, 213, 321 are odd.


(d) Let $\tau$ be the permutation that interchanges two numbers i and j and leaves the other numbers fixed. That is,

\[ \tau(i) = j, \qquad \tau(j) = i, \qquad \tau(k) = k, \quad\text{where } k \ne i, j \]

We call $\tau$ a transposition. If i < j, then there are $2(j - i) - 1$ inversions in $\tau$, and hence, the transposition $\tau$
is odd.


Remark: One can show that, for any n, half of the permutations in $S_n$ are even and half of them are
odd. For example, 3 of the 6 permutations in $S_3$ are even, and 3 are odd.
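Counting inversions is mechanical, so the sign of a permutation is easy to compute. The following sketch assumes a permutation is stored as a tuple $(j_1, j_2, \ldots, j_n)$; the names `inversions` and `sgn` are illustrative.

```python
from itertools import permutations

def inversions(perm):
    """Count pairs (i, k) with i > k such that i precedes k in perm."""
    n = len(perm)
    return sum(1 for p in range(n) for q in range(p + 1, n) if perm[p] > perm[q])

def sgn(perm):
    """Return 1 for an even permutation and -1 for an odd one."""
    return 1 if inversions(perm) % 2 == 0 else -1

sigma = (3, 5, 1, 4, 2)          # the permutation 35142 of Example 8.6(a)
even_s3 = [p for p in permutations((1, 2, 3)) if sgn(p) == 1]
print(inversions(sigma), sgn(sigma), even_s3)
```

The double loop is quadratic in n, which is adequate for small permutations such as those appearing in the definition of the determinant.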


8.5. Determinants of Arbitrary Order 


Let $A = [a_{ij}]$ be a square matrix of order n over a field K.
Consider a product of n elements of A such that one and only one element comes from each row and
one and only one element comes from each column. Such a product can be written in the form

\[ a_{1j_1}a_{2j_2}\cdots a_{nj_n} \]

that is, where the factors come from successive rows, and so the first subscripts are in the natural order
$1, 2, \ldots, n$. Now because the factors come from different columns, the sequence of second subscripts
forms a permutation $\sigma = j_1j_2\cdots j_n$ in $S_n$. Conversely, each permutation in $S_n$ determines a product of the
above form. Thus, the matrix A contains n! such products.


DEFINITION: The determinant of $A = [a_{ij}]$, denoted by det(A) or |A|, is the sum of all the above n!
products, where each such product is multiplied by sgn $\sigma$. That is,

\[ |A| = \sum_{\sigma} (\operatorname{sgn}\sigma)\, a_{1j_1}a_{2j_2}\cdots a_{nj_n} \]

or

\[ |A| = \sum_{\sigma \in S_n} (\operatorname{sgn}\sigma)\, a_{1\sigma(1)}a_{2\sigma(2)}\cdots a_{n\sigma(n)} \]

The determinant of the n-square matrix A is said to be of order n.


The next example shows that the above definition agrees with the previous definition of determinants 
of orders 1, 2, and 3. 


EXAMPLE 8.7 


(a) Let $A = [a_{11}]$ be a 1 × 1 matrix. Because $S_1$ has only one permutation, which is even, $\det(A) = a_{11}$, the number
itself.


(b) Let $A = [a_{ij}]$ be a 2 × 2 matrix. In $S_2$, the permutation 12 is even and the permutation 21 is odd. Hence,

\[ \det(A) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21} \]


(c) Let $A = [a_{ij}]$ be a 3 × 3 matrix. In $S_3$, the permutations 123, 231, 312 are even, and the permutations 321, 213,
132 are odd. Hence,

\[ \det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32} \]


Remark: As n increases, the number of terms in the determinant becomes astronomical. 
Accordingly, we use indirect methods to evaluate determinants rather than the definition of the 
determinant. In fact, we prove a number of properties about determinants that will permit us to shorten 
the computation considerably. In particular, we show that a determinant of order n is equal to a linear 
combination of determinants of order n — 1, as in the case n = 3 above. 
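To make the definition concrete, the sum over all n! permutations can be evaluated directly for small matrices. This is only a sketch for illustration (the names `sgn` and `det_by_definition` are assumptions); as the remark notes, the number of terms grows too quickly for this to be a practical method.

```python
from itertools import permutations

def sgn(perm):
    """Sign of a permutation, computed by counting inversions."""
    n = len(perm)
    inv = sum(1 for p in range(n) for q in range(p + 1, n) if perm[p] > perm[q])
    return 1 if inv % 2 == 0 else -1

def det_by_definition(a):
    """Sum of sgn(sigma) * a[0][sigma(0)] * ... * a[n-1][sigma(n-1)] over all sigma."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):   # 0-based column indices
        term = sgn(sigma)
        for row in range(n):
            term *= a[row][sigma[row]]
        total += term
    return total

A = [[2, 1, 1], [0, 5, -2], [1, -3, 4]]
print(det_by_definition(A))
```

Using 0-based indices does not change the sign, because the number of inversions depends only on the relative order of the entries.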


8.6 Properties of Determinants 


We now list basic properties of the determinant. 


THEOREM 8.1: The determinant of a matrix A and its transpose $A^T$ are equal; that is, $|A| = |A^T|$.


By this theorem (proved in Problem 8.22), any theorem about the determinant of a matrix A that 
concerns the rows of A will have an analogous theorem concerning the columns of A. 

The next theorem (proved in Problem 8.24) gives certain cases for which the determinant can be 
obtained immediately. 


THEOREM 8.2: Let A be a square matrix.

(i) If A has a row (column) of zeros, then |A| = 0.
(ii) If A has two identical rows (columns), then |A| = 0.
(iii) If A is triangular (i.e., A has zeros above or below the diagonal), then
|A| = product of diagonal elements. Thus, in particular, |I| = 1, where I is the
identity matrix.


The next theorem (proved in Problems 8.23 and 8.25) shows how the determinant of a matrix is
affected by the elementary row and column operations.


THEOREM 8.3: Suppose B is obtained from A by an elementary row (column) operation.
(i) If two rows (columns) of A were interchanged, then |B| = -|A|.
(ii) If a row (column) of A were multiplied by a scalar k, then |B| = k|A|.
(iii) If a multiple of a row (column) of A were added to another row (column) of A,
then |B| = |A|.


Major Properties of Determinants 


We now state two of the most important and useful theorems on determinants. 


THEOREM 8.4: The determinant of a product of two matrices A and B is the product of their 
determinants; that is, 


det(AB) = det(A) det(B) 
The above theorem says that the determinant is a multiplicative function. 


THEOREM 8.5: Let A be a square matrix. Then the following are equivalent:

(i) A is invertible; that is, A has an inverse $A^{-1}$.
(ii) AX = 0 has only the zero solution.
(iii) The determinant of A is not zero; that is, $\det(A) \ne 0$.


Remark: Depending on the author and the text, a nonsingular matrix A is defined to be an
invertible matrix A, or a matrix A for which $|A| \ne 0$, or a matrix A for which AX = 0 has only the zero
solution. The above theorem shows that all such definitions are equivalent.

We will prove Theorems 8.4 and 8.5 (in Problems 8.29 and 8.28, respectively) using the theory of 
elementary matrices and the following lemma (proved in Problem 8.26), which is a special case of 
Theorem 8.4. 


LEMMA 8.6: Let E be an elementary matrix. Then, for any matrix A, |EA| = |E||A|.

Recall that matrices A and B are similar if there exists a nonsingular matrix P such that $B = P^{-1}AP$.
Using the multiplicative property of the determinant (Theorem 8.4), one can easily prove (Problem 8.31)
the following theorem.


THEOREM 8.7: Suppose A and B are similar matrices. Then |A| = |B|.


8.7 Minors and Cofactors 


Consider an n-square matrix $A = [a_{ij}]$. Let $M_{ij}$ denote the (n - 1)-square submatrix of A obtained by
deleting its ith row and jth column. The determinant $|M_{ij}|$ is called the minor of the element $a_{ij}$ of A, and
we define the cofactor of $a_{ij}$, denoted by $A_{ij}$, to be the "signed" minor:

\[ A_{ij} = (-1)^{i+j}|M_{ij}| \]

Note that the "signs" $(-1)^{i+j}$ accompanying the minors form a chessboard pattern with +'s on the main
diagonal:

\[ \begin{bmatrix} + & - & + & - & \cdots \\ - & + & - & + & \cdots \\ + & - & + & - & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix} \]

We emphasize that $M_{ij}$ denotes a matrix, whereas $A_{ij}$ denotes a scalar.


Remark: The sign $(-1)^{i+j}$ of the cofactor $A_{ij}$ is frequently obtained using the checkerboard pattern.
Specifically, beginning with + and alternating signs:

\[ +,\ -,\ +,\ -,\ \ldots \]

count from the main diagonal to the appropriate square.


EXAMPLE 8.8 Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$. Find the following minors and cofactors: (a) $|M_{23}|$ and $A_{23}$,
(b) $|M_{31}|$ and $A_{31}$.

(a) Deleting row 2 and column 3, $|M_{23}| = \begin{vmatrix} 1 & 2 \\ 7 & 8 \end{vmatrix} = 8 - 14 = -6$, and so $A_{23} = (-1)^{2+3}|M_{23}| = -(-6) = 6$

(b) Deleting row 3 and column 1, $|M_{31}| = \begin{vmatrix} 2 & 3 \\ 5 & 6 \end{vmatrix} = 12 - 15 = -3$, and so $A_{31} = (-1)^{3+1}|M_{31}| = +(-3) = -3$


Laplace Expansion
The following theorem (proved in Problem 8.32) holds.
THEOREM 8.8: (Laplace) The determinant of a square matrix $A = [a_{ij}]$ is equal to the sum of the
products obtained by multiplying the elements of any row (column) by their
respective cofactors:

\[ |A| = a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in} = \sum_{j=1}^{n} a_{ij}A_{ij} \]

\[ |A| = a_{1j}A_{1j} + a_{2j}A_{2j} + \cdots + a_{nj}A_{nj} = \sum_{i=1}^{n} a_{ij}A_{ij} \]

The above formulas for |A| are called the Laplace expansions of the determinant of A by the ith row
and the jth column. Together with the elementary row (column) operations, they offer a method of
simplifying the computation of |A|, as described below.
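The Laplace expansion leads naturally to a recursive procedure. The sketch below uses 0-based indices (so the cofactor $A_{23}$ of the text corresponds to `cofactor(A, 1, 2)`); the function names `minor_matrix`, `det`, and `cofactor` are assumptions made for illustration.

```python
def minor_matrix(a, i, j):
    """Submatrix M_ij: delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by Laplace expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor_matrix(a, 0, j))
               for j in range(len(a)))

def cofactor(a, i, j):
    """Signed minor (-1)^(i+j) |M_ij| (0-based indices)."""
    return (-1) ** (i + j) * det(minor_matrix(a, i, j))

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # matrix of Example 8.8
print(det(minor_matrix(A, 1, 2)), cofactor(A, 1, 2), cofactor(A, 2, 0))
```

Expanding along any other row or column gives the same value, which can be checked by summing `A[i][j] * cofactor(A, i, j)` over j for a fixed i.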


8.8 Evaluation of Determinants 


The following algorithm reduces the evaluation of a determinant of order n to the evaluation of a 
determinant of order n — 1. 


ALGORITHM 8.1: (Reduction of the order of a determinant) The input is a nonzero n-square matrix
$A = [a_{ij}]$ with n > 1.


Step 1. Choose an element $a_{ij} = 1$ or, if lacking, $a_{ij} \ne 0$.


Step 2. Using $a_{ij}$ as a pivot, apply elementary row (column) operations to put 0's in all the other
positions in the column (row) containing $a_{ij}$.


Step 3. Expand the determinant by the column (row) containing $a_{ij}$.
The following remarks are in order.


Remark 1: Algorithm 8.1 is usually used for determinants of order 4 or more. With determinants
of order less than 4, one uses the specific formulas for the determinant.


Remark 2: Gaussian elimination or, equivalently, repeated use of Algorithm 8.1 together with row
interchanges can be used to transform a matrix A into an upper triangular matrix whose determinant is the
product of its diagonal entries. However, one must keep track of the number of row interchanges, because
each row interchange changes the sign of the determinant.


EXAMPLE 8.9 Use Algorithm 8.1 to find the determinant of $A = \begin{bmatrix} 5 & 4 & 2 & 1 \\ 2 & 3 & 1 & -2 \\ -5 & -7 & -3 & 9 \\ 1 & -2 & -1 & 4 \end{bmatrix}$.


Use $a_{23} = 1$ as a pivot to put 0's in the other positions of the third column; that is, apply the row operations
"Replace $R_1$ by $-2R_2 + R_1$," "Replace $R_3$ by $3R_2 + R_3$," and "Replace $R_4$ by $R_2 + R_4$." By Theorem 8.3(iii), the
value of the determinant does not change under these operations. Thus,

\[ |A| = \begin{vmatrix} 5 & 4 & 2 & 1 \\ 2 & 3 & 1 & -2 \\ -5 & -7 & -3 & 9 \\ 1 & -2 & -1 & 4 \end{vmatrix} = \begin{vmatrix} 1 & -2 & 0 & 5 \\ 2 & 3 & 1 & -2 \\ 1 & 2 & 0 & 3 \\ 3 & 1 & 0 & 2 \end{vmatrix} \]

Now expand by the third column. Specifically, neglect all terms that contain 0 and use the fact that the sign of the
minor $M_{23}$ is $(-1)^{2+3} = -1$. Thus,

\[ |A| = -\begin{vmatrix} 1 & -2 & 5 \\ 1 & 2 & 3 \\ 3 & 1 & 2 \end{vmatrix} = -(4 - 18 + 5 - 30 - 3 + 4) = -(-38) = 38 \]

8.9 Classical Adjoint 


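Let $A = [a_{ij}]$ be an n × n matrix over a field K and let $A_{ij}$ denote the cofactor of $a_{ij}$. The classical adjoint
of A, denoted by adj A, is the transpose of the matrix of cofactors of A. Namely,

\[ \operatorname{adj} A = [A_{ij}]^T \]

We say "classical adjoint" instead of simply "adjoint" because the term "adjoint" is currently used for
an entirely different concept.


EXAMPLE 8.10 Let $A = \begin{bmatrix} 2 & 3 & -4 \\ 0 & -4 & 2 \\ 1 & -1 & 5 \end{bmatrix}$. The cofactors of the nine elements of A follow:

\[ A_{11} = +\begin{vmatrix} -4 & 2 \\ -1 & 5 \end{vmatrix} = -18, \qquad A_{12} = -\begin{vmatrix} 0 & 2 \\ 1 & 5 \end{vmatrix} = 2, \qquad A_{13} = +\begin{vmatrix} 0 & -4 \\ 1 & -1 \end{vmatrix} = 4 \]
\[ A_{21} = -\begin{vmatrix} 3 & -4 \\ -1 & 5 \end{vmatrix} = -11, \qquad A_{22} = +\begin{vmatrix} 2 & -4 \\ 1 & 5 \end{vmatrix} = 14, \qquad A_{23} = -\begin{vmatrix} 2 & 3 \\ 1 & -1 \end{vmatrix} = 5 \]
\[ A_{31} = +\begin{vmatrix} 3 & -4 \\ -4 & 2 \end{vmatrix} = -10, \qquad A_{32} = -\begin{vmatrix} 2 & -4 \\ 0 & 2 \end{vmatrix} = -4, \qquad A_{33} = +\begin{vmatrix} 2 & 3 \\ 0 & -4 \end{vmatrix} = -8 \]

The transpose of the above matrix of cofactors yields the classical adjoint of A; that is,

\[ \operatorname{adj} A = \begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix} \]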

The following theorem (proved in Problem 8.34) holds. 


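THEOREM 8.9: Let A be any square matrix. Then

\[ A(\operatorname{adj} A) = (\operatorname{adj} A)A = |A|I \]

where I is the identity matrix. Thus, if $|A| \ne 0$,

\[ A^{-1} = \frac{1}{|A|}(\operatorname{adj} A) \]


EXAMPLE 8.11 Let A be the matrix in Example 8.10. We have

\[ \det(A) = -40 + 6 + 0 - 16 + 4 + 0 = -46 \]

Thus, A does have an inverse, and, by Theorem 8.9,

\[ A^{-1} = \frac{1}{|A|}(\operatorname{adj} A) = -\frac{1}{46}\begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix} = \begin{bmatrix} \frac{9}{23} & \frac{11}{46} & \frac{5}{23} \\ -\frac{1}{23} & -\frac{7}{23} & \frac{2}{23} \\ -\frac{2}{23} & -\frac{5}{46} & \frac{4}{23} \end{bmatrix} \]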
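The classical adjoint and the inverse formula of Theorem 8.9 can be sketched as follows. The helper names (`minor_matrix`, `det`, `adjugate`, `inverse`) are illustrative assumptions, and `fractions.Fraction` keeps the entries of the inverse exact.

```python
from fractions import Fraction

def minor_matrix(a, i, j):
    """Delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by Laplace expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor_matrix(a, 0, j))
               for j in range(len(a)))

def adjugate(a):
    """Classical adjoint: transpose of the matrix of cofactors (n >= 2)."""
    n = len(a)
    cof = [[(-1) ** (i + j) * det(minor_matrix(a, i, j)) for j in range(n)]
           for i in range(n)]
    return [list(col) for col in zip(*cof)]

def inverse(a):
    """Inverse via A^{-1} = (1/|A|) adj A; raises if A is singular."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(x, d) for x in row] for row in adjugate(a)]

A = [[2, 3, -4], [0, -4, 2], [1, -1, 5]]   # matrix of Example 8.10
print(adjugate(A))
print(inverse(A))
```

This formula is mainly of theoretical interest; computing all $n^2$ cofactors is far more work than row reduction for larger matrices.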


8.10 Applications to Linear Equations, Cramer’s Rule 


Consider a system AX = B of n linear equations in n unknowns. Here $A = [a_{ij}]$ is the (square) matrix of
coefficients and $B = [b_i]$ is the column vector of constants. Let $A_i$ be the matrix obtained from A by
replacing the ith column of A by the column vector B. Furthermore, let

\[ D = \det(A), \quad N_1 = \det(A_1), \quad N_2 = \det(A_2), \quad \ldots, \quad N_n = \det(A_n) \]

The fundamental relationship between determinants and the solution of the system AX = B follows.


THEOREM 8.10: The (square) system AX = B has a unique solution if and only if $D \ne 0$. In this case, the
unique solution is given by

\[ x_1 = \frac{N_1}{D}, \quad x_2 = \frac{N_2}{D}, \quad \ldots, \quad x_n = \frac{N_n}{D} \]

The above theorem (proved in Problem 8.10) is known as Cramer's rule for solving systems of linear
equations. We emphasize that the theorem only refers to a system with the same number of equations as
unknowns, and that it only gives the solution when $D \ne 0$. In fact, if D = 0, the theorem does not tell us
whether or not the system has a solution. However, in the case of a homogeneous system, we have the
following useful result (to be proved in Problem 8.54).


THEOREM 8.11: A square homogeneous system AX = 0 has a nonzero solution if and only if
D = |A| = 0.
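EXAMPLE 8.12 Solve the following system using determinants:

\[ x + y + z = 5, \qquad x - 2y - 3z = -1, \qquad 2x + y - z = 3 \]

First compute the determinant D of the matrix of coefficients:

\[ D = \begin{vmatrix} 1 & 1 & 1 \\ 1 & -2 & -3 \\ 2 & 1 & -1 \end{vmatrix} = 2 - 6 + 1 + 4 + 3 + 1 = 5 \]

Because $D \ne 0$, the system has a unique solution. To compute $N_x$, $N_y$, $N_z$, we replace, respectively, the coefficients
of x, y, z in the matrix of coefficients by the constant terms. This yields

\[ N_x = \begin{vmatrix} 5 & 1 & 1 \\ -1 & -2 & -3 \\ 3 & 1 & -1 \end{vmatrix} = 20, \qquad N_y = \begin{vmatrix} 1 & 5 & 1 \\ 1 & -1 & -3 \\ 2 & 3 & -1 \end{vmatrix} = -10, \qquad N_z = \begin{vmatrix} 1 & 1 & 5 \\ 1 & -2 & -1 \\ 2 & 1 & 3 \end{vmatrix} = 15 \]

Thus, the unique solution of the system is $x = N_x/D = 4$, $y = N_y/D = -2$, $z = N_z/D = 3$; that is, the
vector $u = (4, -2, 3)$.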
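Cramer's rule for a general square system can be sketched in a few lines. The names `det` and `cramer` below are illustrative assumptions; the determinant is computed by Laplace expansion, which is adequate only for small systems.

```python
from fractions import Fraction

def det(a):
    """Determinant by Laplace expansion along the first row."""
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j]
               * det([row[:j] + row[j + 1:] for row in a[1:]])
               for j in range(len(a)))

def cramer(a, b):
    """Solve AX = B by Cramer's rule; returns None when D = 0."""
    d = det(a)
    if d == 0:
        return None                      # the rule gives no information here
    solution = []
    for i in range(len(a)):
        # A_i: replace the ith column of A by the column of constants B.
        a_i = [row[:i] + [b[k]] + row[i + 1:] for k, row in enumerate(a)]
        solution.append(Fraction(det(a_i), d))
    return solution

A = [[1, 1, 1], [1, -2, -3], [2, 1, -1]]
B = [5, -1, 3]
print(cramer(A, B))
```

Returning `None` when D = 0 reflects the caveat in the text: in that case the system may have no solution or many, and other methods are needed to decide.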


8.11 Submatrices, Minors, Principal Minors 


Let $A = [a_{ij}]$ be a square matrix of order n. Consider any r rows and r columns of A. That is, consider any 
set $I = (i_1, i_2, \ldots, i_r)$ of r row indices and any set $J = (j_1, j_2, \ldots, j_r)$ of r column indices. Then I and J 
define an r × r submatrix of A, denoted by A(I; J), obtained by deleting the rows and columns of A whose 
subscripts do not belong to I or J, respectively. That is, 


$$A(I; J) = [a_{st} : s \in I,\ t \in J]$$

The determinant |A(I; J)| is called a minor of A of order r and 


$$(-1)^{i_1 + i_2 + \cdots + i_r + j_1 + j_2 + \cdots + j_r}\, |A(I; J)|$$


is the corresponding signed minor. (Note that a minor of order n − 1 is a minor in the sense of Section 
8.7, and the corresponding signed minor is a cofactor.) Furthermore, if I′ and J′ denote, respectively, the 
remaining row and column indices, then 


$$|A(I'; J')|$$

denotes the complementary minor, and its sign (Problem 8.74) is the same sign as the minor. 


EXAMPLE 8.13 Let $A = [a_{ij}]$ be a 5-square matrix, and let I = {1, 2, 4} and J = {2, 3, 5}. Then 
I′ = {3, 5} and J′ = {1, 4}, and the corresponding minor |M| and complementary minor |M′| are as 
follows: 

$$|M| = |A(I; J)| = \begin{vmatrix} a_{12} & a_{13} & a_{15} \\ a_{22} & a_{23} & a_{25} \\ a_{42} & a_{43} & a_{45} \end{vmatrix} \quad\text{and}\quad |M'| = |A(I'; J')| = \begin{vmatrix} a_{31} & a_{34} \\ a_{51} & a_{54} \end{vmatrix}$$


Because 1 + 2 + 4 + 2 + 3 + 5 = 17 is odd, −|M| is the signed minor, and −|M′| is the signed complementary 
minor. 


Principal Minors 


A minor is principal if the row and column indices are the same, or equivalently, if the diagonal elements 
of the minor come from the diagonal of the matrix. We note that the sign of a principal minor is always 
+1, because the sum of the row and identical column subscripts must always be even. 


EXAMPLE 8.14 Let $A = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 5 & 4 \\ -3 & 1 & -2 \end{bmatrix}$. Find the sums $C_1$, $C_2$, and $C_3$ of the principal minors of A of 
orders 1, 2, and 3, respectively. 

(a) There are three principal minors of order 1. These are 

$$|1| = 1, \quad |5| = 5, \quad |-2| = -2, \quad\text{and so}\quad C_1 = 1 + 5 - 2 = 4$$

Note that $C_1$ is simply the trace of A. Namely, $C_1 = \operatorname{tr}(A)$. 


(b) There are three ways to choose two of the three diagonal elements, and each choice gives a minor of order 2. 
These are 

$$\begin{vmatrix} 1 & 2 \\ 3 & 5 \end{vmatrix} = -1, \quad \begin{vmatrix} 1 & -1 \\ -3 & -2 \end{vmatrix} = -5, \quad \begin{vmatrix} 5 & 4 \\ 1 & -2 \end{vmatrix} = -14$$


(Note that these minors of order 2 are the cofactors $A_{33}$, $A_{22}$, and $A_{11}$ of A, respectively.) Thus, 

$$C_2 = -1 - 5 - 14 = -20$$


(c) There is only one way to choose three of the three diagonal elements. Thus, the only minor of order 3 is the 
determinant of A itself. Thus, 


$$C_3 = |A| = -10 - 24 - 3 - 15 - 4 + 12 = -44$$


8.12 Block Matrices and Determinants 


The following theorem (proved in Problem 8.36) is the main result of this section. 


THEOREM 8.12: Suppose M is an upper (lower) triangular block matrix with the diagonal blocks 
$A_1, A_2, \ldots, A_n$. Then 


$$\det(M) = \det(A_1)\det(A_2)\cdots\det(A_n)$$

EXAMPLE 8.15 Find |M| where 

$$M = \left[\begin{array}{cc|ccc} 2 & 3 & 4 & 7 & 8 \\ -1 & 5 & 3 & 2 & 1 \\ \hline 0 & 0 & 2 & 1 & 5 \\ 0 & 0 & 3 & -1 & 4 \\ 0 & 0 & 5 & 2 & 6 \end{array}\right]$$


Note that M is an upper triangular block matrix. Evaluate the determinant of each diagonal block: 


$$\begin{vmatrix} 2 & 3 \\ -1 & 5 \end{vmatrix} = 10 + 3 = 13, \qquad \begin{vmatrix} 2 & 1 & 5 \\ 3 & -1 & 4 \\ 5 & 2 & 6 \end{vmatrix} = -12 + 20 + 30 + 25 - 16 - 18 = 29$$


Then |M| = 13(29) = 377. 


Remark: Suppose $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$, where A, B, C, D are square matrices. Then it is not generally 
true that |M| = |A||D| − |B||C|. (See Problem 8.68.) 
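As a quick numerical check of Theorem 8.12, the following Python sketch computes the determinant of the 5-square matrix of Example 8.15 directly and compares it with the product of the determinants of its two diagonal blocks. The helper `det` (Laplace expansion) is a hypothetical name introduced only for this illustration.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


M = [
    [2, 3, 4, 7, 8],
    [-1, 5, 3, 2, 1],
    [0, 0, 2, 1, 5],
    [0, 0, 3, -1, 4],
    [0, 0, 5, 2, 6],
]

# Diagonal blocks: the leading 2 x 2 block and the trailing 3 x 3 block.
A1 = [row[:2] for row in M[:2]]
A2 = [row[2:] for row in M[2:]]

print(det(A1), det(A2), det(M))  # 13 29 377
```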


8.13 Determinants and Volume 


Determinants are related to the notions of area and volume as follows. Let $u_1, u_2, \ldots, u_n$ be vectors in $\mathbf{R}^n$. 
Let S be the (solid) parallelepiped determined by the vectors; that is, 


$$S = \{a_1 u_1 + a_2 u_2 + \cdots + a_n u_n : 0 \le a_i \le 1 \text{ for } i = 1, \ldots, n\}$$


(When n = 2, S is a parallelogram.) Let V(S) denote the volume of S (or area of S when n = 2). Then 

V(S) = absolute value of det(A) 

where A is the matrix with rows $u_1, u_2, \ldots, u_n$. In general, V(S) = 0 if and only if the vectors $u_1, \ldots, u_n$ 
do not form a coordinate system for $\mathbf{R}^n$ (i.e., if and only if the vectors are linearly dependent). 


EXAMPLE 8.16 Let $u_1 = (1, 1, 0)$, $u_2 = (1, 1, 1)$, $u_3 = (0, 2, 3)$. Find the volume V(S) of the parallelepiped 
S in $\mathbf{R}^3$ (Fig. 8-2) determined by the three vectors. 


Figure 8-2 


Evaluate the determinant of the matrix whose rows are $u_1, u_2, u_3$: 

$$\begin{vmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 2 & 3 \end{vmatrix} = 3 + 0 + 0 - 0 - 2 - 3 = -2$$


Hence, V(S) = |−2| = 2. 
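The volume formula is easy to try out numerically. The short Python sketch below is illustrative only (the names `det` and `volume` are not from the text); it evaluates V(S) for the three vectors of Example 8.16 and for a linearly dependent triple, where the volume collapses to 0.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def volume(vectors):
    """V(S) = |det(A)|, where the rows of A are the given vectors."""
    return abs(det([list(v) for v in vectors]))


print(volume([(1, 1, 0), (1, 1, 1), (0, 2, 3)]))   # 2
print(volume([(1, 2, 4), (2, 1, -3), (5, 7, 9)]))  # 0, the vectors are dependent
```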


8.14 Determinant of a Linear Operator 


Let F be a linear operator on a vector space V with finite dimension. Let A be the matrix representation of 
F relative to some basis S of V. Then we define the determinant of F, written det(F), by 


det(F) = |A| 

If B is another matrix representation of F relative to another basis S′ of V, then A and B are similar 
matrices (Theorem 6.7) and |B| = |A| (Theorem 8.7). In other words, the above definition of det(F) is 
independent of the particular basis S of V. (We say that the definition is well defined.) 


The next theorem (to be proved in Problem 8.62) follows from analogous theorems on matrices. 


THEOREM 8.13: Let F and G be linear operators on a vector space V. Then 
(i) det(F ∘ G) = det(F) det(G). 
(ii) F is invertible if and only if det(F) ≠ 0. 


EXAMPLE 8.17 Let F be the following linear operator on $\mathbf{R}^3$ and let A be the matrix that represents F 
relative to the usual basis of $\mathbf{R}^3$: 


$$F(x, y, z) = (2x - 4y + z,\ x - 2y + 3z,\ 5x + y - z) \quad\text{and}\quad A = \begin{bmatrix} 2 & -4 & 1 \\ 1 & -2 & 3 \\ 5 & 1 & -1 \end{bmatrix}$$


Then 
det(F) = |A| = 4 − 60 + 1 + 10 − 6 − 4 = −55 

8.15 Multilinearity and Determinants 


Let V be a vector space over a field K. Let $\mathcal{A} = V^n$; that is, $\mathcal{A}$ consists of all the n-tuples 

$$A = (A_1, A_2, \ldots, A_n)$$


where the $A_i$ are vectors in V. The following definitions apply. 


DEFINITION: A function $D : \mathcal{A} \to K$ is said to be multilinear if it is linear in each component: 
(i) If $A_i = B + C$, then 
$D(A) = D(\ldots, B + C, \ldots) = D(\ldots, B, \ldots) + D(\ldots, C, \ldots)$ 
(ii) If $A_i = kB$, where $k \in K$, then 
$D(A) = D(\ldots, kB, \ldots) = kD(\ldots, B, \ldots)$ 


We also say n-linear for multilinear if there are n components. 


DEFINITION: A function $D : \mathcal{A} \to K$ is said to be alternating if D(A) = 0 whenever A has two 
identical elements: 


$$D(A_1, A_2, \ldots, A_n) = 0 \quad\text{whenever}\quad A_i = A_j,\ i \ne j$$


Now let M denote the set of all n-square matrices A over a field K. We may view A as an n-tuple 
consisting of its row vectors $A_1, A_2, \ldots, A_n$; that is, we may view A in the form $A = (A_1, A_2, \ldots, A_n)$. 


The following theorem (proved in Problem 8.37) characterizes the determinant function. 


THEOREM 8.14: There exists a unique function $D : M \to K$ such that 
(i) D is multilinear, (ii) D is alternating, (iii) D(I) = 1. 


This function D is the determinant function; that is, D(A) = |A|, for any matrix 
$A \in M$. 
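The three defining properties can be spot-checked numerically. The Python sketch below is an illustration only: the rows `R1`, `B`, `C`, `R3` and the scalar `k` are arbitrary values chosen for this example, and `det` is a hypothetical helper implementing Laplace expansion. A numerical check on one instance is of course not a proof.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


# Arbitrary illustrative data (not from the text).
R1, B, C, R3 = [1, 2, 0], [3, -1, 4], [0, 5, 2], [2, 2, 1]
k = 7

B_plus_C = [b + c for b, c in zip(B, C)]
kB = [k * b for b in B]

# (i) Linearity in the second row.
additive = det([R1, B_plus_C, R3]) == det([R1, B, R3]) + det([R1, C, R3])
homogeneous = det([R1, kB, R3]) == k * det([R1, B, R3])

# (ii) Alternating: two identical rows give 0.
alternating = det([R1, R1, R3]) == 0

# (iii) Normalization: D(I) = 1.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
normalized = det(identity) == 1

print(additive, homogeneous, alternating, normalized)  # True True True True
```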


SOLVED PROBLEMS 


Computation of Determinants 


8.1. Evaluate the determinant of each of the following matrices: 


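$$\text{(a) } A = \begin{bmatrix} 6 & 5 \\ 2 & 3 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 2 & -3 \\ 4 & 7 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 4 & -5 \\ -1 & -2 \end{bmatrix}, \quad \text{(d) } D = \begin{bmatrix} t - 5 & 6 \\ 3 & t + 2 \end{bmatrix}$$

Use the formula $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$: 


(a) |A| = 6(3) − 5(2) = 18 − 10 = 8 

(b) |B| = 14 + 12 = 26 

(c) |C| = −8 − 5 = −13 

(d) |D| = (t − 5)(t + 2) − 18 = t² − 3t − 10 − 18 = t² − 3t − 28 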


8.2. Evaluate the determinant of each of the following matrices: 


$$\text{(a) } A = \begin{bmatrix} 2 & 3 & 4 \\ 5 & 4 & 3 \\ 1 & 2 & 1 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 1 & -2 & 3 \\ 2 & 4 & -1 \\ 1 & 5 & -2 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 1 & 3 & -5 \\ 3 & -1 & 2 \\ 1 & -2 & 1 \end{bmatrix}$$

Use the diagram in Fig. 8-1 to obtain the six products: 
(a) |A| = 2(4)(1) + 3(3)(1) + 4(2)(5) − 1(4)(4) − 2(3)(2) − 1(3)(5) = 8 + 9 + 40 − 16 − 12 − 15 = 14 
(b) |B| = −8 + 2 + 30 − 12 + 5 − 8 = 9 
(c) |C| = −1 + 6 + 30 − 5 + 4 − 9 = 25 


8.3. Compute the determinant of each of the following matrices: 


ae re Bi 
@ 4=|5 6 7/1,@ B=] o5 6/© C=|3 1 1 
Di 0 00 3 1 —4 1 


(a) One can simplify the entries by first subtracting twice the first row from the second row; that is, by 
applying the row operation "Replace $R_2$ by $-2R_1 + R_2$." Then 


2 3 4| |2 3 
|4| =|5 
8 


(b) B is triangular, so |B| = product of the diagonal entries = −120. 
(c) The arithmetic is simpler if fractions are first eliminated. Hence, multiply the first row $R_1$ by 6 and the 
second row $R_2$ by 4. Then 

$$24|C| = \begin{vmatrix} 3 & -6 & -2 \\ 3 & 2 & -4 \\ 1 & -4 & 1 \end{vmatrix} = 6 + 24 + 24 + 4 - 48 + 18 = 28, \quad\text{so}\quad |C| = \frac{28}{24} = \frac{7}{6}$$

8.4. Compute the determinant of each of the following matrices: 


Sue 26 6 2 1 0 5 
2. Sr tie il 

(a) A= 7 = = = |, œ B=| 1 1 2 -2 3 
wa aS 3 0 2 3 =1 


=] =l -3 4 2 


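(a) Use $a_{31} = 1$ as a pivot to put 0's in the first column, by applying the row operations "Replace $R_1$ by 
$-2R_3 + R_1$," "Replace $R_2$ by $2R_3 + R_2$," and "Replace $R_4$ by $R_3 + R_4$." Then 


$$|A| = \begin{vmatrix} 2 & 5 & -3 & -2 \\ -2 & -3 & 2 & -5 \\ 1 & 3 & -2 & 2 \\ -1 & -6 & 4 & 3 \end{vmatrix} = \begin{vmatrix} 0 & -1 & 1 & -6 \\ 0 & 3 & -2 & -1 \\ 1 & 3 & -2 & 2 \\ 0 & -3 & 2 & 5 \end{vmatrix} = \begin{vmatrix} -1 & 1 & -6 \\ 3 & -2 & -1 \\ -3 & 2 & 5 \end{vmatrix}$$

$$= 10 + 3 - 36 + 36 - 2 - 15 = -4$$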


(b) First reduce |B| to a determinant of order 4, and then to a determinant of order 3, for which we can use 
Fig. 8-1. First use $b_{22} = 1$ as a pivot to put 0's in the second column, by applying the row operations 
"Replace $R_1$ by $-2R_2 + R_1$," "Replace $R_3$ by $-R_2 + R_3$," and "Replace $R_5$ by $R_2 + R_5$." Then 


my 4 3 
2 -1 4 14 5 
Dei 
=1 10 0 0 0 
|B| =|- 1 0 2J= = 
3 23 -1 5 W s — 
A 3 -1 
: 1-2 2 3| |-1 @: 7 


=. ne KB UU FB NN 
NY WwW bh OOo OF OG 
Cofactors, Classical Adjoints, Minors, Principal Minors 


8.5. Let $A = \begin{bmatrix} 2 & 1 & -3 & 4 \\ 5 & -4 & 7 & -2 \\ 4 & 0 & 6 & -3 \\ 3 & -2 & 5 & 2 \end{bmatrix}$. 


(a) Find $A_{23}$, the cofactor (signed minor) of 7 in A. 

(b) Find the minor and the signed minor of the submatrix M = A(2, 4; 2, 3). 

(c) Find the principal minor determined by the first and third diagonal entries, that is, by 
M = A(1, 3; 1, 3). 

(a) Take the determinant of the submatrix of A obtained by deleting row 2 and column 3 (those which 
contain the 7), and multiply the determinant by $(-1)^{2+3}$: 

$$A_{23} = -\begin{vmatrix} 2 & 1 & 4 \\ 4 & 0 & -3 \\ 3 & -2 & 2 \end{vmatrix} = -(0 - 9 - 32 - 0 - 12 - 8) = -(-61) = 61$$

The exponent 2 + 3 comes from the subscripts of $A_{23}$, that is, from the fact that 7 appears in row 2 and 
column 3. 

(b) The row subscripts are 2 and 4 and the column subscripts are 2 and 3. Hence, the minor is the 
determinant 

$$|M| = \begin{vmatrix} a_{22} & a_{23} \\ a_{42} & a_{43} \end{vmatrix} = \begin{vmatrix} -4 & 7 \\ -2 & 5 \end{vmatrix} = -20 + 14 = -6$$

and the signed minor is $(-1)^{2+4+2+3}|M| = -|M| = -(-6) = 6$. 
(c) The principal minor is the determinant 

$$|M| = \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} = \begin{vmatrix} 2 & -3 \\ 4 & 6 \end{vmatrix} = 12 + 12 = 24$$


Note that now the diagonal entries of the submatrix are diagonal entries of the original matrix. Also, the 
sign of the principal minor is positive. 


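8.6. Let $B = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 3 & 4 \\ 5 & 8 & 9 \end{bmatrix}$. Find: (a) |B|, (b) adj B, (c) $B^{-1}$ using adj B. 


(a) |B| = 27 + 20 + 16 − 15 − 32 − 18 = −2 
(b) Take the transpose of the matrix of cofactors: 

$$\operatorname{adj} B = \begin{bmatrix} \begin{vmatrix} 3 & 4 \\ 8 & 9 \end{vmatrix} & -\begin{vmatrix} 2 & 4 \\ 5 & 9 \end{vmatrix} & \begin{vmatrix} 2 & 3 \\ 5 & 8 \end{vmatrix} \\[2ex] -\begin{vmatrix} 1 & 1 \\ 8 & 9 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ 5 & 9 \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ 5 & 8 \end{vmatrix} \\[2ex] \begin{vmatrix} 1 & 1 \\ 3 & 4 \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ 2 & 4 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix} \end{bmatrix}^T = \begin{bmatrix} -5 & 2 & 1 \\ -1 & 4 & -3 \\ 1 & -2 & 1 \end{bmatrix}^T = \begin{bmatrix} -5 & -1 & 1 \\ 2 & 4 & -2 \\ 1 & -3 & 1 \end{bmatrix}$$

(c) Because |B| ≠ 0, 

$$B^{-1} = \frac{1}{|B|}(\operatorname{adj} B) = -\frac{1}{2}\begin{bmatrix} -5 & -1 & 1 \\ 2 & 4 & -2 \\ 1 & -3 & 1 \end{bmatrix} = \begin{bmatrix} \frac{5}{2} & \frac{1}{2} & -\frac{1}{2} \\ -1 & -2 & 1 \\ -\frac{1}{2} & \frac{3}{2} & -\frac{1}{2} \end{bmatrix}$$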
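The classical adjoint and the inverse obtained from it can be computed mechanically. The following Python sketch is a small illustration using the matrix B of Problem 8.6; the helper names `det`, `minor`, `adjugate`, and `inverse` are hypothetical, and `fractions.Fraction` is assumed so that the entries of the inverse stay exact.

```python
from fractions import Fraction


def minor(m, i, j):
    """Submatrix of m with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def adjugate(m):
    """Classical adjoint: the transpose of the matrix of cofactors."""
    n = len(m)
    cof = [[(-1) ** (i + j) * det(minor(m, i, j)) for j in range(n)] for i in range(n)]
    return [[cof[j][i] for j in range(n)] for i in range(n)]


def inverse(m):
    """Inverse via (1/|m|) adj m; only valid when |m| != 0."""
    d = det(m)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(x, d) for x in row] for row in adjugate(m)]


B = [[1, 1, 1], [2, 3, 4], [5, 8, 9]]
print(det(B))       # -2
print(adjugate(B))  # [[-5, -1, 1], [2, 4, -2], [1, -3, 1]]
```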


8.7. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 0 & 7 & 8 \end{bmatrix}$, and let $S_k$ denote the sum of its principal minors of order k. Find $S_k$ for 
(a) k = 1, (b) k = 2, (c) k = 3. 

(a) The principal minors of order 1 are the diagonal elements. Thus, $S_1$ is the trace of A; that is, 
$S_1 = \operatorname{tr}(A) = 1 + 5 + 8 = 14$ 


(b) The principal minors of order 2 are the cofactors of the diagonal elements. Thus, 

$$S_2 = A_{11} + A_{22} + A_{33} = \begin{vmatrix} 5 & 6 \\ 7 & 8 \end{vmatrix} + \begin{vmatrix} 1 & 3 \\ 0 & 8 \end{vmatrix} + \begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix} = -2 + 8 - 3 = 3$$


(c) There is only one principal minor of order 3, the determinant of A. Then 
$S_3 = |A| = 40 + 0 + 84 - 0 - 42 - 64 = 18$ 


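8.8. Let $A = \begin{bmatrix} 1 & 3 & 0 & -1 \\ -4 & 2 & 5 & 1 \\ 1 & 0 & 3 & -2 \\ 3 & -2 & 1 & 4 \end{bmatrix}$. Find the number $N_k$ and sum $S_k$ of principal minors of order: 
(a) k = 1, (b) k = 2, (c) k = 3, (d) k = 4. 

Each (nonempty) subset of the diagonal (or equivalently, each nonempty subset of {1, 2, 3, 4}) 
determines a principal minor of A, and $N_k = \binom{n}{k} = \dfrac{n!}{k!(n-k)!}$ of them are of order k. 
Thus, $N_1 = 4$, $N_2 = 6$, $N_3 = 4$, $N_4 = 1$. 


(a) $S_1 = |1| + |2| + |3| + |4| = 1 + 2 + 3 + 4 = 10$ 

(b) $$S_2 = \begin{vmatrix} 1 & 3 \\ -4 & 2 \end{vmatrix} + \begin{vmatrix} 1 & 0 \\ 1 & 3 \end{vmatrix} + \begin{vmatrix} 1 & -1 \\ 3 & 4 \end{vmatrix} + \begin{vmatrix} 2 & 5 \\ 0 & 3 \end{vmatrix} + \begin{vmatrix} 2 & 1 \\ -2 & 4 \end{vmatrix} + \begin{vmatrix} 3 & -2 \\ 1 & 4 \end{vmatrix} = 14 + 3 + 7 + 6 + 10 + 14 = 54$$

(c) $$S_3 = \begin{vmatrix} 1 & 3 & 0 \\ -4 & 2 & 5 \\ 1 & 0 & 3 \end{vmatrix} + \begin{vmatrix} 1 & 3 & -1 \\ -4 & 2 & 1 \\ 3 & -2 & 4 \end{vmatrix} + \begin{vmatrix} 1 & 0 & -1 \\ 1 & 3 & -2 \\ 3 & 1 & 4 \end{vmatrix} + \begin{vmatrix} 2 & 5 & 1 \\ 0 & 3 & -2 \\ -2 & 1 & 4 \end{vmatrix} = 57 + 65 + 22 + 54 = 198$$

(d) $S_4 = \det(A) = 378$ 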
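The enumeration in Problem 8.8 can be automated: every k-element subset of the index set gives one principal minor of order k. The Python sketch below is illustrative only (the names `det` and `principal_minor_sums` are not from the text) and reproduces the sums for the matrix of Problem 8.8.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def principal_minor_sums(a):
    """Return [S_1, ..., S_n], where S_k sums all principal minors of order k."""
    n = len(a)
    sums = []
    for k in range(1, n + 1):
        total = 0
        for idx in combinations(range(n), k):
            # Same index set for rows and columns makes the minor principal.
            sub = [[a[i][j] for j in idx] for i in idx]
            total += det(sub)
        sums.append(total)
    return sums


A = [[1, 3, 0, -1], [-4, 2, 5, 1], [1, 0, 3, -2], [3, -2, 1, 4]]
print(principal_minor_sums(A))  # [10, 54, 198, 378]
```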


Determinants and Systems of Linear Equations 


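8.9. Use determinants to solve the system 

$$\begin{cases} 3y + 2x = z + 1 \\ 3x + 2z = 8 - 5y \\ 3z - 1 = x - 2y \end{cases}$$


First arrange the equations in standard form, then compute the determinant D of the matrix of 
coefficients: 


$$\begin{aligned} 2x + 3y - z &= 1 \\ 3x + 5y + 2z &= 8 \\ x - 2y - 3z &= -1 \end{aligned} \quad\text{and}\quad D = \begin{vmatrix} 2 & 3 & -1 \\ 3 & 5 & 2 \\ 1 & -2 & -3 \end{vmatrix} = -30 + 6 + 6 + 5 + 8 + 27 = 22$$


Because D ≠ 0, the system has a unique solution. To compute $N_x$, $N_y$, $N_z$, we replace, respectively, the 
coefficients of x, y, z in the matrix of coefficients by the constant terms. Then 

$$N_x = \begin{vmatrix} 1 & 3 & -1 \\ 8 & 5 & 2 \\ -1 & -2 & -3 \end{vmatrix} = 66, \quad N_y = \begin{vmatrix} 2 & 1 & -1 \\ 3 & 8 & 2 \\ 1 & -1 & -3 \end{vmatrix} = -22, \quad N_z = \begin{vmatrix} 2 & 3 & 1 \\ 3 & 5 & 8 \\ 1 & -2 & -1 \end{vmatrix} = 44$$

Thus, 

$$x = \frac{N_x}{D} = \frac{66}{22} = 3, \quad y = \frac{N_y}{D} = \frac{-22}{22} = -1, \quad z = \frac{N_z}{D} = \frac{44}{22} = 2$$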
8.10. Consider the system 

$$\begin{cases} kx + y + z = 1 \\ x + ky + z = 1 \\ x + y + kz = 1 \end{cases}$$


Use determinants to find those values of k for which the system has 
(a) a unique solution, (b) more than one solution, (c) no solution. 


(a) The system has a unique solution when D ≠ 0, where D is the determinant of the matrix of coefficients. 
Compute 


$$D = \begin{vmatrix} k & 1 & 1 \\ 1 & k & 1 \\ 1 & 1 & k \end{vmatrix} = k^3 + 1 + 1 - k - k - k = k^3 - 3k + 2 = (k - 1)^2(k + 2)$$


Thus, the system has a unique solution when 
$(k - 1)^2(k + 2) \ne 0$, that is, when k ≠ 1 and k ≠ −2. 
(b and c) Gaussian elimination shows that the system has more than one solution when k = 1, and the 
system has no solution when k = −2. 


Miscellaneous Problems 

8.11. Find the volume V(S) of the parallelepiped S in $\mathbf{R}^3$ determined by the vectors: 
(a) $u_1 = (1, 1, 1)$, $u_2 = (1, 3, -4)$, $u_3 = (1, 2, -5)$. 
(b) $u_1 = (1, 2, 4)$, $u_2 = (2, 1, -3)$, $u_3 = (5, 7, 9)$. 


V(S) is the absolute value of the determinant of the matrix M whose rows are the given vectors. Thus, 


(a) $|M| = \begin{vmatrix} 1 & 1 & 1 \\ 1 & 3 & -4 \\ 1 & 2 & -5 \end{vmatrix} = -15 - 4 + 2 - 3 + 8 + 5 = -7$. Hence, V(S) = |−7| = 7. 

(b) $|M| = \begin{vmatrix} 1 & 2 & 4 \\ 2 & 1 & -3 \\ 5 & 7 & 9 \end{vmatrix} = 9 - 30 + 56 - 20 + 21 - 36 = 0$. Thus, V(S) = 0, or, in other words, $u_1, u_2, u_3$ 
lie in a plane and are linearly dependent. 


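8.12. Find det(M) where 

$$M = \begin{bmatrix} 3 & 4 & 0 & 0 & 0 \\ 2 & 5 & 0 & 0 & 0 \\ 0 & 9 & 2 & 0 & 0 \\ 0 & 5 & 0 & 6 & 7 \\ 0 & 0 & 4 & 3 & 4 \end{bmatrix} = \left[\begin{array}{cc|c|cc} 3 & 4 & 0 & 0 & 0 \\ 2 & 5 & 0 & 0 & 0 \\ \hline 0 & 9 & 2 & 0 & 0 \\ \hline 0 & 5 & 0 & 6 & 7 \\ 0 & 0 & 4 & 3 & 4 \end{array}\right]$$

M is a (lower) triangular block matrix; hence, evaluate the determinant of each diagonal block: 

$$\begin{vmatrix} 3 & 4 \\ 2 & 5 \end{vmatrix} = 15 - 8 = 7, \quad |2| = 2, \quad \begin{vmatrix} 6 & 7 \\ 3 & 4 \end{vmatrix} = 24 - 21 = 3$$


Thus, |M| = 7(2)(3) = 42. 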


8.13. Find the determinant of $F : \mathbf{R}^3 \to \mathbf{R}^3$ defined by 
F(x, y, z) = (x + 3y − 4z, 2y + 7z, x + 5y − 3z) 

The determinant of a linear operator F is equal to the determinant of any matrix that represents F. Thus 
first find the matrix A representing F in the usual basis (whose rows, respectively, consist of the coefficients 
of x, y, z). Then 


$$A = \begin{bmatrix} 1 & 3 & -4 \\ 0 & 2 & 7 \\ 1 & 5 & -3 \end{bmatrix}, \quad\text{and so}\quad \det(F) = |A| = -6 + 21 + 0 + 8 - 35 - 0 = -12$$


8.14. Write out $g = g(x_1, x_2, x_3, x_4)$ explicitly where 


$$g(x_1, x_2, \ldots, x_n) = \prod_{i<j} (x_i - x_j)$$

The symbol $\prod$ is used for a product of terms in the same way that the symbol $\sum$ is used for a sum of 
terms. That is, $\prod_{i<j} (x_i - x_j)$ means the product of all terms $(x_i - x_j)$ for which i < j. Hence, 


$$g = g(x_1, \ldots, x_4) = (x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_2 - x_3)(x_2 - x_4)(x_3 - x_4)$$
8.15. Let D be a 2-linear, alternating function. Show that D(A, B) = —D(B, A). 
Because D is alternating, D(A, A) = 0, D(B, B) = 0. Hence, 
D(A + B,A + B) =D(A,A) + D(A, B) + D(B,A) + D(B, B) = D(A, B) + D(B,A) 


However, D(A + B, A + B) = 0. Hence, D(A, B) = —D(B, A), as required. 


Permutations 
8.16. Determine the parity (sign) of the permutation σ = 364152. 


Count the number of inversions. That is, for each element k, count the number of elements i in σ such 
that i > k and i precedes k in σ. Namely, 


k = 1: 3 numbers (3, 6, 4)        k = 4: 1 number (6) 
k = 2: 4 numbers (3, 6, 4, 5)     k = 5: 1 number (6) 
k = 3: 0 numbers                  k = 6: 0 numbers 


Because 3 + 4 + 0 + 1 + 1 + 0 = 9 is odd, σ is an odd permutation, and sgn σ = −1. 
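The inversion count is straightforward to program. The Python sketch below is a minimal illustration (the function names `count_inversions` and `sgn` are chosen here, not taken from the text); a permutation is represented by the list of its images, so σ = 364152 becomes `[3, 6, 4, 1, 5, 2]`.

```python
def count_inversions(perm):
    """Number of pairs (i, k) with i > k where i precedes k in the list."""
    n = len(perm)
    return sum(
        1
        for a in range(n)
        for b in range(a + 1, n)
        if perm[a] > perm[b]
    )


def sgn(perm):
    """+1 for an even permutation, -1 for an odd one."""
    return 1 if count_inversions(perm) % 2 == 0 else -1


sigma = [3, 6, 4, 1, 5, 2]
print(count_inversions(sigma), sgn(sigma))  # 9 -1
```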
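8.17. Let σ = 24513 and τ = 41352 be permutations in $S_5$. Find (a) τ ∘ σ, (b) $\sigma^{-1}$. 
Recall that σ = 24513 and τ = 41352 are short ways of writing 

$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{pmatrix} \quad\text{and}\quad \tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 3 & 5 & 2 \end{pmatrix}$$

(a) The effects of σ and then τ on 1, 2, 3, 4, 5 are as follows: 
1 → 2 → 1,   2 → 4 → 5,   3 → 5 → 2,   4 → 1 → 4,   5 → 3 → 3 
[That is, for example, (τ ∘ σ)(1) = τ(σ(1)) = τ(2) = 1.] Thus, τ ∘ σ = 15243. 
(b) By definition, $\sigma^{-1}(j) = k$ if and only if σ(k) = j. Hence, 


$$\sigma^{-1} = \begin{pmatrix} 2 & 4 & 5 & 1 & 3 \\ 1 & 2 & 3 & 4 & 5 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 5 & 2 & 3 \end{pmatrix} \quad\text{or}\quad \sigma^{-1} = 41523$$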

8.18. Let $\sigma = j_1 j_2 \cdots j_n$ be any permutation in $S_n$. Show that, for each inversion (i, k) where i > k but i 
precedes k in σ, there is a pair (i*, k*) such that 

$$i^* < k^* \quad\text{and}\quad \sigma(i^*) > \sigma(k^*) \qquad (1)$$

and vice versa. Thus, σ is even or odd according to whether there is an even or an odd number of 
pairs satisfying (1). 

Choose i* and k* so that σ(i*) = i and σ(k*) = k. Then i > k if and only if σ(i*) > σ(k*), and i 
precedes k in σ if and only if i* < k*. 


8.19. Consider the polynomials $g = g(x_1, \ldots, x_n)$ and σ(g), defined by 

$$g = g(x_1, \ldots, x_n) = \prod_{i<j} (x_i - x_j) \quad\text{and}\quad \sigma(g) = \prod_{i<j} (x_{\sigma(i)} - x_{\sigma(j)})$$

(See Problem 8.14.) Show that σ(g) = g when σ is an even permutation, and σ(g) = −g when σ is 
an odd permutation. That is, σ(g) = (sgn σ)g. 

Because σ is one-to-one and onto, 

$$\sigma(g) = \prod_{i<j} (x_{\sigma(i)} - x_{\sigma(j)}) = \prod_{i<j \text{ or } i>j} (x_i - x_j)$$

Thus, σ(g) = g or σ(g) = −g according to whether there is an even or an odd number of terms of the form 
$x_i - x_j$, where i > j. Note that for each pair (i, j) for which 


$$i < j \quad\text{and}\quad \sigma(i) > \sigma(j) \qquad (1)$$


there is a term $(x_{\sigma(i)} - x_{\sigma(j)})$ in σ(g) for which σ(i) > σ(j). Because σ is even if and only if there is an even 
number of pairs satisfying (1), we have σ(g) = g if and only if σ is even. Hence, σ(g) = −g if and only if σ 
is odd. 


8.20. Let $\sigma, \tau \in S_n$. Show that sgn(τ ∘ σ) = (sgn τ)(sgn σ). Thus, the product of two even or two odd 
permutations is even, and the product of an odd and an even permutation is odd. 

Using Problem 8.19, we have 

$$\operatorname{sgn}(\tau \circ \sigma)\, g = (\tau \circ \sigma)(g) = \tau(\sigma(g)) = \tau((\operatorname{sgn} \sigma) g) = (\operatorname{sgn} \tau)(\operatorname{sgn} \sigma) g$$

Accordingly, sgn(τ ∘ σ) = (sgn τ)(sgn σ). 


8.21. Consider the permutation $\sigma = j_1 j_2 \cdots j_n$. Show that $\operatorname{sgn} \sigma^{-1} = \operatorname{sgn} \sigma$ and, for scalars $a_{ij}$, 
show that 


$$a_{j_1 1} a_{j_2 2} \cdots a_{j_n n} = a_{1 k_1} a_{2 k_2} \cdots a_{n k_n}$$


where $\sigma^{-1} = k_1 k_2 \cdots k_n$. 


We have $\sigma^{-1} \circ \sigma = \varepsilon$, the identity permutation. Because ε is even, $\sigma^{-1}$ and σ are both even or both odd. 
Hence $\operatorname{sgn} \sigma^{-1} = \operatorname{sgn} \sigma$. 


Because $\sigma = j_1 j_2 \cdots j_n$ is a permutation, $a_{j_1 1} a_{j_2 2} \cdots a_{j_n n} = a_{1 k_1} a_{2 k_2} \cdots a_{n k_n}$. Then $k_1, k_2, \ldots, k_n$ have the 
property that 


$$\sigma(k_1) = 1, \quad \sigma(k_2) = 2, \quad \ldots, \quad \sigma(k_n) = n$$

Let $\tau = k_1 k_2 \cdots k_n$. Then, for i = 1, ..., n, 

$$(\sigma \circ \tau)(i) = \sigma(\tau(i)) = \sigma(k_i) = i$$


Thus, σ ∘ τ = ε, the identity permutation. Hence, $\tau = \sigma^{-1}$. 


Proofs of Theorems 


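8.22. Prove Theorem 8.1: $|A^T| = |A|$. 

If $A = [a_{ij}]$, then $A^T = [b_{ij}]$, with $b_{ij} = a_{ji}$. Hence, 


$$|A^T| = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{\sigma(1),1} a_{\sigma(2),2} \cdots a_{\sigma(n),n}$$


Let $\tau = \sigma^{-1}$. By Problem 8.21, sgn τ = sgn σ, and $a_{\sigma(1),1} a_{\sigma(2),2} \cdots a_{\sigma(n),n} = a_{1\tau(1)} a_{2\tau(2)} \cdots a_{n\tau(n)}$. Hence, 


$$|A^T| = \sum_{\sigma \in S_n} (\operatorname{sgn} \tau)\, a_{1\tau(1)} a_{2\tau(2)} \cdots a_{n\tau(n)}$$


However, as σ runs through all the elements of $S_n$, $\tau = \sigma^{-1}$ also runs through all the elements of $S_n$. Thus, 
$|A^T| = |A|$. 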
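The permutation-sum definition used in this proof can be evaluated directly for small matrices, which gives a concrete check that a matrix and its transpose have the same determinant. The Python sketch below is illustrative only; `sgn`, `leibniz_det`, and `transpose` are names chosen for this example, and the test matrix is A of Problem 8.2.

```python
from itertools import permutations


def sgn(perm):
    """Sign of a permutation of 0..n-1, computed by counting inversions."""
    n = len(perm)
    inversions = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
    return 1 if inversions % 2 == 0 else -1


def leibniz_det(a):
    """|A| as the sum over all permutations of sgn(s) * a[0][s(0)] * ... * a[n-1][s(n-1)]."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sgn(perm)
        for i in range(n):
            term *= a[i][perm[i]]
        total += term
    return total


def transpose(a):
    return [list(col) for col in zip(*a)]


A = [[2, 3, 4], [5, 4, 3], [1, 2, 1]]
print(leibniz_det(A), leibniz_det(transpose(A)))  # 14 14
```

The sum has n! terms, so this form is useful for proofs and tiny examples rather than for computation.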


8.23. Prove Theorem 8.3(i): If two rows (columns) of A are interchanged, then |B| = −|A|. 


We prove the theorem for the case that two columns are interchanged. Let τ be the transposition that 
interchanges the two numbers corresponding to the two columns of A that are interchanged. If $A = [a_{ij}]$ and 
$B = [b_{ij}]$, then $b_{ij} = a_{i\tau(j)}$. Hence, for any permutation σ, 


$$b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} = a_{1(\tau \circ \sigma)(1)}\, a_{2(\tau \circ \sigma)(2)} \cdots a_{n(\tau \circ \sigma)(n)}$$

Thus, 

$$|B| = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{1(\tau \circ \sigma)(1)}\, a_{2(\tau \circ \sigma)(2)} \cdots a_{n(\tau \circ \sigma)(n)}$$

Because the transposition τ is an odd permutation, sgn(τ ∘ σ) = (sgn τ)(sgn σ) = −sgn σ. Accordingly, 
sgn σ = −sgn(τ ∘ σ), and so 

$$|B| = -\sum_{\sigma \in S_n} [\operatorname{sgn}(\tau \circ \sigma)]\, a_{1(\tau \circ \sigma)(1)}\, a_{2(\tau \circ \sigma)(2)} \cdots a_{n(\tau \circ \sigma)(n)}$$


But as σ runs through all the elements of $S_n$, τ ∘ σ also runs through all the elements of $S_n$. Hence, |B| = −|A|. 


8.24. Prove Theorem 8.2. 
(i) If A has a row (column) of zeros, then |A| = 0. 


(ii) If A has two identical rows (columns), then |A| = 0. 

(iii) If A is triangular, then |A| = product of diagonal elements. Thus, |/| = 1. 

(i) Each term in |A| contains a factor from every row, and so from the row of zeros. Thus, each term of |A| 
is zero, and so |A| = 0. 

(ii) Suppose 1 + 1 ≠ 0 in K. If we interchange the two identical rows of A, we still obtain the matrix A. 
Hence, by Problem 8.23, |A| = −|A|, and so |A| = 0. 

Now suppose 1 + 1 = 0 in K. Then sgn σ = 1 for every $\sigma \in S_n$. Because A has two identical 
rows, we can arrange the terms of |A| into pairs of equal terms. Because each pair is 0, the determinant 
of A is zero. 

(iii) Suppose $A = [a_{ij}]$ is lower triangular; that is, the entries above the diagonal are all zero: $a_{ij} = 0$ 
whenever i < j. Consider a term t of the determinant of A: 


$$t = (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots a_{n i_n}, \quad\text{where}\quad \sigma = i_1 i_2 \cdots i_n$$


Suppose $i_1 \ne 1$. Then $1 < i_1$, and so $a_{1 i_1} = 0$; hence, t = 0. That is, each term for which $i_1 \ne 1$ is 
zero. 

Now suppose $i_1 = 1$ but $i_2 \ne 2$. Then $2 < i_2$, and so $a_{2 i_2} = 0$; hence, t = 0. Thus, each term 
for which $i_1 \ne 1$ or $i_2 \ne 2$ is zero. 

Similarly, we obtain that each term for which $i_1 \ne 1$ or $i_2 \ne 2$ or ... or $i_n \ne n$ is zero. 
Accordingly, $|A| = a_{11} a_{22} \cdots a_{nn}$ = product of diagonal elements. 


8.25. Prove Theorem 8.3: B is obtained from A by an elementary operation. 


(i) If two rows (columns) of A were interchanged, then |B| = −|A|. 

(ii) If a row (column) of A were multiplied by a scalar k, then |B| = k|A|. 

(iii) If a multiple of a row (column) of A were added to another row (column) of A, then |B| = |A|. 

(i) This result was proved in Problem 8.23. 

(ii) If the jth row of A is multiplied by k, then every term in |A| is multiplied by k, and so |B| = k|A|. That is, 


$$|B| = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots (k a_{j i_j}) \cdots a_{n i_n} = k \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots a_{n i_n} = k|A|$$


(iii) Suppose c times the kth row is added to the jth row of A. Using the symbol ^ to denote the jth position 
in a determinant term, we have 


$$|B| = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots \widehat{(c\, a_{k i_j} + a_{j i_j})} \cdots a_{n i_n}$$

$$= c \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots \widehat{a_{k i_j}} \cdots a_{n i_n} + \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots a_{j i_j} \cdots a_{n i_n}$$


The first sum is the determinant of a matrix whose kth and jth rows are identical. Accordingly, by 
Theorem 8.2(ii), the sum is zero. The second sum is the determinant of A. Thus, |B| = c · 0 + |A| = |A|. 


8.26. Prove Lemma 8.6: Let E be an elementary matrix. Then |EA| = |E||A|. 


Consider the elementary row operations: (i) Multiply a row by a constant k ≠ 0, 
(ii) Interchange two rows, (iii) Add a multiple of one row to another. 


Let $E_1, E_2, E_3$ be the corresponding elementary matrices. That is, $E_1, E_2, E_3$ are obtained by applying the 
above operations to the identity matrix I. By Problem 8.25, 


$$|E_1| = k|I| = k, \quad |E_2| = -|I| = -1, \quad |E_3| = |I| = 1$$


Recall (Theorem 3.11) that $E_i A$ is identical to the matrix obtained by applying the corresponding operation 
to A. Thus, by Theorem 8.3, we obtain the following which proves our lemma: 


$$|E_1 A| = k|A| = |E_1||A|, \quad |E_2 A| = -|A| = |E_2||A|, \quad |E_3 A| = |A| = 1|A| = |E_3||A|$$


8.27. Suppose B is row equivalent to a square matrix A. Prove that |B| = 0 if and only if |A| = 0. 

By Theorem 8.3, the effect of an elementary row operation is to change the sign of the determinant or to 
multiply the determinant by a nonzero scalar. Hence, |B| = 0 if and only if |A| = 0. 


8.28. Prove Theorem 8.5: Let A be an n-square matrix. Then the following are equivalent: 
(i) A is invertible, (ii) AX = 0 has only the zero solution, (iii) det(A) ≠ 0. 


The proof is by the Gaussian algorithm. If A is invertible, it is row equivalent to I. But |I| ≠ 0. Hence, 
by Problem 8.27, |A| ≠ 0. If A is not invertible, it is row equivalent to a matrix with a zero row. Hence, 
det(A) = 0. Thus, (i) and (iii) are equivalent. 

If AX = 0 has only the solution X = 0, then A is row equivalent to I and A is invertible. Conversely, if 
A is invertible with inverse $A^{-1}$, then 


$$X = IX = (A^{-1}A)X = A^{-1}(AX) = A^{-1}0 = 0$$

is the only solution of AX = 0. Thus, (i) and (ii) are equivalent. 


8.29. Prove Theorem 8.4: |AB| = |A||B|. 


If A is singular, then AB is also singular, and so |AB| = 0 = |A||B|. On the other hand, if A is 
nonsingular, then $A = E_n \cdots E_2 E_1$, a product of elementary matrices. Then, Lemma 8.6 and induction yield 


$$|AB| = |E_n \cdots E_2 E_1 B| = |E_n| \cdots |E_2||E_1||B| = |A||B|$$


8.30. Suppose P is invertible. Prove that $|P^{-1}| = |P|^{-1}$. 


$P^{-1}P = I$. Hence, $1 = |I| = |P^{-1}P| = |P^{-1}||P|$, and so $|P^{-1}| = |P|^{-1}$. 


8.31. Prove Theorem 8.7: Suppose A and B are similar matrices. Then |A| = |B|. 

Because A and B are similar, there exists an invertible matrix P such that $B = P^{-1}AP$. Therefore, using 
Problem 8.30, we get $|B| = |P^{-1}AP| = |P^{-1}||A||P| = |A||P^{-1}||P| = |A|$. 

We remark that although the matrices $P^{-1}$ and A may not commute, their determinants $|P^{-1}|$ and |A| do 
commute, because they are scalars in the field K. 


8.32. Prove Theorem 8.8 (Laplace): Let $A = [a_{ij}]$, and let $A_{ij}$ denote the cofactor of $a_{ij}$. Then, for any i or j 

$$|A| = a_{i1}A_{i1} + \cdots + a_{in}A_{in} \quad\text{and}\quad |A| = a_{1j}A_{1j} + \cdots + a_{nj}A_{nj}$$

Because $|A| = |A^T|$, we need only prove one of the expansions, say, the first one in terms of rows of A. 
Each term in |A| contains one and only one entry of the ith row $(a_{i1}, a_{i2}, \ldots, a_{in})$ of A. Hence, we can write 
|A| in the form 


$$|A| = a_{i1}A^*_{i1} + a_{i2}A^*_{i2} + \cdots + a_{in}A^*_{in}$$


(Note that $A^*_{ij}$ is a sum of terms involving no entry of the ith row of A.) Thus, the theorem is proved if we can 
show that 

$$A^*_{ij} = A_{ij} = (-1)^{i+j}|M_{ij}|$$

where $M_{ij}$ is the matrix obtained by deleting the row and column containing the entry $a_{ij}$. (Historically, the 
expression $A^*_{ij}$ was defined as the cofactor of $a_{ij}$, and so the theorem reduces to showing that the two 
definitions of the cofactor are equivalent.) 


First we consider the case that i = n, j = n. Then the sum of terms in |A| containing $a_{nn}$ is 


$$a_{nn}A^*_{nn} = a_{nn} \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n-1,\sigma(n-1)}$$


where we sum over all permutations $\sigma \in S_n$ for which σ(n) = n. However, this is equivalent (Prove!) to 
summing over all permutations of {1, ..., n − 1}. Thus, $A^*_{nn} = |M_{nn}| = (-1)^{n+n}|M_{nn}|$. 

Now we consider any i and j. We interchange the ith row with each succeeding row until it is last, and 
we interchange the jth column with each succeeding column until it is last. Note that the determinant $|M_{ij}|$ is 
not affected, because the relative positions of the other rows and columns are not affected by these 
interchanges. However, the "sign" of |A| and of $A^*_{ij}$ is changed n − i and then n − j times. Accordingly, 


$$A^*_{ij} = (-1)^{n-i+n-j}|M_{ij}| = (-1)^{i+j}|M_{ij}|$$


8.33. Let $A = [a_{ij}]$ and let B be the matrix obtained from A by replacing the ith row of A by the row 
vector $(b_{i1}, \ldots, b_{in})$. Show that 


$$|B| = b_{i1}A_{i1} + b_{i2}A_{i2} + \cdots + b_{in}A_{in}$$

Furthermore, show that, for j ≠ i, 


$$a_{j1}A_{i1} + a_{j2}A_{i2} + \cdots + a_{jn}A_{in} = 0 \quad\text{and}\quad a_{1j}A_{1i} + a_{2j}A_{2i} + \cdots + a_{nj}A_{ni} = 0$$


Let $B = [b_{ij}]$. By Theorem 8.8, 

$$|B| = b_{i1}B_{i1} + b_{i2}B_{i2} + \cdots + b_{in}B_{in}$$

Because $B_{ij}$ does not depend on the ith row of B, we get $B_{ij} = A_{ij}$ for j = 1, ..., n. Hence, 

$$|B| = b_{i1}A_{i1} + b_{i2}A_{i2} + \cdots + b_{in}A_{in}$$


Now let A′ be obtained from A by replacing the ith row of A by the jth row of A. Because A′ has two 
identical rows, |A′| = 0. Thus, by the above result, 


$$|A'| = a_{j1}A_{i1} + a_{j2}A_{i2} + \cdots + a_{jn}A_{in} = 0$$

Using $|A^T| = |A|$, we also obtain that $a_{1j}A_{1i} + a_{2j}A_{2i} + \cdots + a_{nj}A_{ni} = 0$. 


8.34. Prove Theorem 8.9: A(adj A) = (adj A)A = |A|I. 

Let $A = [a_{ij}]$ and let $A(\operatorname{adj} A) = [b_{ij}]$. The ith row of A is 

$$(a_{i1}, a_{i2}, \ldots, a_{in}) \qquad (1)$$


Because adj A is the transpose of the matrix of cofactors, the jth column of adj A is the transpose of the 
cofactors of the jth row of A: 


$$(A_{j1}, A_{j2}, \ldots, A_{jn})^T \qquad (2)$$


Now $b_{ij}$, the ij entry in A(adj A), is obtained by multiplying expressions (1) and (2): 

$$b_{ij} = a_{i1}A_{j1} + a_{i2}A_{j2} + \cdots + a_{in}A_{jn}$$

By Theorem 8.8 and Problem 8.33, 

$$b_{ij} = \begin{cases} |A| & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases}$$

Accordingly, A(adj A) is the diagonal matrix with each diagonal element |A|. In other words, 
A(adj A) = |A|I. Similarly, (adj A)A = |A|I. 


8.35. Prove Theorem 8.10 (Cramer's rule): The (square) system AX = B has a unique solution if and 
only if D ≠ 0. In this case, $x_i = N_i/D$ for each i. 


By previous results, AX = B has a unique solution if and only if A is invertible, and A is invertible if and 
only if D = |A| ≠ 0. 
Now suppose D ≠ 0. By Theorem 8.9, $A^{-1} = (1/D)(\operatorname{adj} A)$. Multiplying AX = B by $A^{-1}$, we obtain 

$$X = A^{-1}AX = (1/D)(\operatorname{adj} A)B \qquad (1)$$

Note that the ith row of (1/D)(adj A) is $(1/D)(A_{1i}, A_{2i}, \ldots, A_{ni})$. If $B = (b_1, b_2, \ldots, b_n)^T$, then, by (1), 

$$x_i = (1/D)(b_1 A_{1i} + b_2 A_{2i} + \cdots + b_n A_{ni})$$


However, as in Problem 8.33, $b_1 A_{1i} + b_2 A_{2i} + \cdots + b_n A_{ni} = N_i$, the determinant of the matrix obtained by 
replacing the ith column of A by the column vector B. Thus, $x_i = (1/D)N_i$, as required. 


8.36. Prove Theorem 8.12: Suppose M is an upper (lower) triangular block matrix with diagonal blocks 
$A_1, A_2, \ldots, A_n$. Then 

$$\det(M) = \det(A_1)\det(A_2)\cdots\det(A_n)$$

We need only prove the theorem for n = 2, that is, when M is a square matrix of the form 
$M = \begin{bmatrix} A & C \\ 0 & B \end{bmatrix}$. The proof of the general theorem follows easily by induction. 


Suppose $A = [a_{ij}]$ is r-square, $B = [b_{ij}]$ is s-square, and $M = [m_{ij}]$ is n-square, where n = r + s. By 
definition, 


$$\det(M) = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, m_{1\sigma(1)} m_{2\sigma(2)} \cdots m_{n\sigma(n)}$$


If i > r and j ≤ r, then $m_{ij} = 0$. Thus, we need only consider those permutations σ such that 

$$\sigma\{r+1, r+2, \ldots, r+s\} = \{r+1, r+2, \ldots, r+s\} \quad\text{and}\quad \sigma\{1, 2, \ldots, r\} = \{1, 2, \ldots, r\}$$

Let $\sigma_1(k) = \sigma(k)$ for k ≤ r, and let $\sigma_2(k) = \sigma(r + k) - r$ for k ≤ s. Then 

$$(\operatorname{sgn} \sigma)\, m_{1\sigma(1)} m_{2\sigma(2)} \cdots m_{n\sigma(n)} = (\operatorname{sgn} \sigma_1)\, a_{1\sigma_1(1)} a_{2\sigma_1(2)} \cdots a_{r\sigma_1(r)}\, (\operatorname{sgn} \sigma_2)\, b_{1\sigma_2(1)} b_{2\sigma_2(2)} \cdots b_{s\sigma_2(s)}$$


which implies det(M) = det(A) det(B). 


8.37. Prove Theorem 8.14: There exists a unique function $D : M \to K$ such that 


(i) D is multilinear, (ii) D is alternating, (iii) D(I) = 1. 
This function D is the determinant function; that is, D(A) = |A|. 
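Let D be the determinant function, D(A) = |A|. We must show that D satisfies (i), (ii), and (iii), and that 
D is the only function satisfying (i), (ii), and (iii). 
By Theorem 8.2, D satisfies (ii) and (iii). Hence, we show that it is multilinear. Suppose the ith row of 
$A = [a_{ij}]$ has the form $(b_{i1} + c_{i1},\ b_{i2} + c_{i2},\ \ldots,\ b_{in} + c_{in})$. Then 

$$\begin{aligned} D(A) &= D(A_1, \ldots, B_i + C_i, \ldots, A_n) \\ &= \sum_{S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots a_{i-1,\sigma(i-1)} (b_{i\sigma(i)} + c_{i\sigma(i)}) \cdots a_{n\sigma(n)} \\ &= \sum_{S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots b_{i\sigma(i)} \cdots a_{n\sigma(n)} + \sum_{S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots c_{i\sigma(i)} \cdots a_{n\sigma(n)} \\ &= D(A_1, \ldots, B_i, \ldots, A_n) + D(A_1, \ldots, C_i, \ldots, A_n) \end{aligned}$$

Also, by Theorem 8.3(ii), 

$$D(A_1, \ldots, kA_i, \ldots, A_n) = kD(A_1, \ldots, A_i, \ldots, A_n)$$


Thus, D is multilinear; that is, D satisfies (i). 


We next must prove the uniqueness of D. Suppose D satisfies (i), (ii), and (iii). If $\{e_1, \ldots, e_n\}$ is the 
usual basis of $K^n$, then, by (iii), $D(e_1, e_2, \ldots, e_n) = D(I) = 1$. Using (ii), we also have that 


$$D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = \operatorname{sgn} \sigma, \quad\text{where}\quad \sigma = i_1 i_2 \cdots i_n \qquad (1)$$

Now suppose $A = [a_{ij}]$. Observe that the kth row $A_k$ of A is 

$$A_k = (a_{k1}, a_{k2}, \ldots, a_{kn}) = a_{k1}e_1 + a_{k2}e_2 + \cdots + a_{kn}e_n$$


Thus, 


$$D(A) = D(a_{11}e_1 + \cdots + a_{1n}e_n,\ a_{21}e_1 + \cdots + a_{2n}e_n,\ \ldots,\ a_{n1}e_1 + \cdots + a_{nn}e_n)$$


Using the multilinearity of D, we can write D(A) as a sum of terms of the form 


$$D(A) = \sum D(a_{1 i_1}e_{i_1}, a_{2 i_2}e_{i_2}, \ldots, a_{n i_n}e_{i_n}) = \sum (a_{1 i_1} a_{2 i_2} \cdots a_{n i_n})\, D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) \qquad (2)$$


where the sum is summed over all sequences $i_1 i_2 \ldots i_n$, where $i_k \in \{1, \ldots, n\}$. If two of the indices are equal, 
say $i_j = i_k$ but j ≠ k, then, by (ii), 

$$D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = 0$$


Accordingly, the sum in (2) need only be summed over all permutations $\sigma = i_1 i_2 \cdots i_n$. Using (1), we finally 
have that 


$$D(A) = \sum_{\sigma} (a_{1 i_1} a_{2 i_2} \cdots a_{n i_n})\, D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1 i_1} a_{2 i_2} \cdots a_{n i_n}, \quad\text{where}\quad \sigma = i_1 i_2 \cdots i_n$$


Hence, D is the determinant function, and so the theorem is proved. 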


SUPPLEMENTARY PROBLEMS 


Computation of Determinants 
8.38. Evaluate: 


2 6 5 1 —2 8 4 9 a+b a 
8.39. Find all t such that (a) $\begin{vmatrix} t - 4 & 3 \\ 2 & t - 9 \end{vmatrix} = 0$, (b) $\begin{vmatrix} t - 1 & 4 \\ 3 & t - 2 \end{vmatrix} = 0$ 
8.40. Compute the determinant of each of the following matrices: 
2 1 1 3 -2 =4 —2 =I 4 6 
(a) /0 5 -2], (b)}2 5 -1j, œ 6 —3 -2], (d) {1 2 
1-3 4 0 6 1 4 1 2 3 =2 




8.41. Find the determinant of each of the following matrices: 


i & 2 4 2 1 3 
1 0 -2 0 3 0 1 
Mls a 1 -2)/ ®ji -1 4 
4 3 0 2 I | 
8.42. Evaluate: 
2 -1 3 =f 2 -1 4 
2 1 -2 1 = 0 
j3 3 -5 al ©) 3 3 
5 2-1 4 { -2 2 


8.43. Evaluate each of the following determinants: 


i 2 =i 3 4 1 
Saf je 3 2 

(a) | 3 1 0 2 -1}, œ l0 
5 1 2 3 4 0 

=y fa 1 = 0 


Cofactors, Classical Adjoints, Inverses 
8.44. Find det(A), adj A, and 4~!, where 


110 12 2 
(a) 4=|1 1 1], (&) 4=13 10 
021 1 11 


8.45. Find the classical adjoint of each matrix in Problem 8.41. 
8.46. Let A= f y 

c d 
8.47. 


8.48. Suppose A = [a,] is triangular. Show that 


Show that if A is diagonal (triangular) then adj A is diagonal (triangular). 


(a) A is invertible if and only if each diagonal element a; 4 0. 


(b) The diagonal elements of A~! (if it exists) are a 


Minors, Principal Minors 


1 2 3 2 1 
1 0 -2 2 
8.49. Let A = 3 2 5 and B= 0 
4 -3 0 -l 3 


corresponding to the following submatrices: 


(a) A(1,4; 3,4), (b) B(1,4; 3,4), © 


ii 


A(2, 3; 


1 


> 


the reciprocals of the diagonal elements of A. 


5 
4 
1 
—2 


2,4), (d) B(2,3; 


8.50. For k = 1,2,3, find the sum S, of all principal minors of order k for 


i 32 1 5 
(a) A4=|2 -4 3|, b B=]2 6 
5 2 1 3 2 


—4 
1 >’ 
0 


1 =a 
(c) C=]2 1 
a aa 


2 
=) 
3 
1 
1 -2 3 -1 
2 1 1 -2 0 
ab ©l2 o 4 5 
1 4 4 -6 
579 12 3 
242 5 4 3 
12 3, © |0 0 6 
5 6 2 0 0 0 
2 3 1 00 0 


NNUNA 


Ww A meme 


| (a) Find adj A, (b) Show that adj(adj A) =A, (c) When does 4 = adj A? 


. Find the minor and the signed minor 


2,4). 


11 
8.51. For $k = 1, 2, 3, 4$, find the sum $S_k$ of all principal minors of order $k$ for


1 2 3 -l 1212 
12 0 5 0123 
@ á=] i2 gle ® Bly 304 
4 0 -1 -3 2745 


Determinants and Linear Equations 


8.52. Solve the following systems by determinants: 


3x + Sy = 8 2x —-3y=-1 ax — 2by =c 
(a) ae ay (b) a (0) peal (ab # 0) 


8.53. Solve the following systems by determinants: 


(a) $2x - 5y + 2z = 2$, $x + 2y - 4z = 5$, $3x - 4y - 6z = 1$
(b) $2z + 3 = y + 3x$, $x - 3z = 2y + 1$, $3y + z = 2 - 2x$


8.54. Prove Theorem 8.11: The system AX = 0 has a nonzero solution if and only if D = |A| = 0. 


Permutations 
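8.55. Find the parity of the permutations $\sigma = 32154$, $\tau = 13524$, $\pi = 42531$ in $S_5$.

8.56. For the permutations in Problem 8.55, find
(a) $\tau \circ \sigma$, (b) $\pi \circ \sigma$, (c) $\sigma^{-1}$, (d) $\tau^{-1}$.

8.57. Let $\tau \in S_n$. Show that $\tau \circ \sigma$ runs through $S_n$ as $\sigma$ runs through $S_n$; that is, $S_n = \{\tau \circ \sigma : \sigma \in S_n\}$.

8.58. Let $\sigma \in S_n$ have the property that $\sigma(n) = n$. Let $\sigma^* \in S_{n-1}$ be defined by $\sigma^*(x) = \sigma(x)$.
(a) Show that $\operatorname{sgn} \sigma^* = \operatorname{sgn} \sigma$.
(b) Show that as $\sigma$ runs through $S_n$, where $\sigma(n) = n$, $\sigma^*$ runs through $S_{n-1}$; that is,
$S_{n-1} = \{\sigma^* : \sigma \in S_n, \sigma(n) = n\}$.

8.59. Consider a permutation $\sigma = j_1 j_2 \ldots j_n$. Let $\{e_i\}$ be the usual basis of $K^n$, and let A be the matrix whose ith
row is $e_{j_i}$ [i.e., $A = (e_{j_1}, e_{j_2}, \ldots, e_{j_n})$]. Show that $|A| = \operatorname{sgn} \sigma$.
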
Determinant of Linear Operators 
8.60. Find the determinant of each of the following linear transformations:
(a) $T: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $T(x, y) = (2x - 9y,\ 3x - 5y)$,
(b) $T: \mathbf{R}^3 \to \mathbf{R}^3$ defined by $T(x, y, z) = (3x - 2z,\ 5y + 7z,\ x + y + z)$,
(c) $T: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $T(x, y, z) = (2x + 7y - 4z,\ 4x - 6y + 2z)$.

8.61. Let $D: V \to V$ be the differential operator; that is, $D(f(t)) = df/dt$. Find det(D) if V is the vector space of
functions with the following bases: (a) $\{1, t, \ldots, t^5\}$, (b) $\{e^t, e^{2t}, e^{3t}\}$, (c) $\{\sin t, \cos t\}$.


8.62. Prove Theorem 8.13: Let F and G be linear operators on a vector space V. Then 


(i) $\det(F \circ G) = \det(F)\det(G)$, (ii) F is invertible if and only if $\det(F) \neq 0$.


8.63. Prove (a) $\det(\mathbf{1}_V) = 1$, where $\mathbf{1}_V$ is the identity operator, (b) $\det(T^{-1}) = \det(T)^{-1}$ when T is invertible.

Miscellaneous Problems


8.64. 


8.65. 


8.66. 


8.67. 


8.68. 


8.69. 


8.70. 


8.71. 


8.72. 


8.73. 


Find the volume V(S) of the parallelepiped S in $\mathbf{R}^3$ determined by the following vectors:


(a) $u_1 = (1, 2, -3)$, $u_2 = (3, 4, -1)$, $u_3 = (2, -1, 5)$,
(b) $u_1 = (1, 1, 3)$, $u_2 = (1, -2, -4)$, $u_3 = (4, 1, 5)$.


Find the volume V(S) of the parallelepiped S in $\mathbf{R}^4$ determined by the following vectors:
$u_1 = (1, -2, 5, -1)$, $u_2 = (2, 1, -2, 1)$, $u_3 = (3, 0, 1, -2)$, $u_4 = (1, -1, 4, -1)$


Let V be the space of $2 \times 2$ matrices $M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ over $\mathbf{R}$. Determine whether $D: V \to \mathbf{R}$ is 2-linear (with
respect to the rows), where a 


) D(M) = ac — bd, (e) DM) 
, (d) D(M) = ab — cd, (£) D(M) = 


Let A be an n-square matrix. Prove |kA| = k”|A|. 


Let A, B, C, D be commuting n-square matrices. Consider the 2n-square block matrix $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$. Prove
that $|M| = |A||D| - |B||C|$. Show that the result may not be true if the matrices do not commute.


Suppose A is orthogonal; that is, $A^T A = I$. Show that $\det(A) = \pm 1$.


Let V be the space of m-square matrices viewed as m-tuples of row vectors. Suppose $D: V \to K$ is m-linear
and alternating. Show that


(a) $D(\ldots, A, \ldots, B, \ldots) = -D(\ldots, B, \ldots, A, \ldots)$; sign changed when two rows are interchanged.
(b) If $A_1, A_2, \ldots, A_m$ are linearly dependent, then $D(A_1, A_2, \ldots, A_m) = 0$.


Let V be the space of m-square matrices (as above), and suppose $D: V \to K$. Show that the following weaker
statement is equivalent to D being alternating:


$$D(A_1, A_2, \ldots, A_m) = 0 \quad \text{whenever } A_i = A_{i+1} \text{ for some } i$$


Let V be the space of n-square matrices over K. Suppose $B \in V$ is invertible and so $\det(B) \neq 0$. Define
$D: V \to K$ by $D(A) = \det(AB)/\det(B)$, where $A \in V$. Hence,


$$D(A_1, A_2, \ldots, A_n) = \det(A_1 B, A_2 B, \ldots, A_n B)/\det(B)$$


where $A_i$ is the ith row of A, and so $A_i B$ is the ith row of AB. Show that D is multilinear and alternating, and
that $D(I) = 1$. (This method is used by some texts to prove that $|AB| = |A||B|$.)


Show that $g = g(x_1, \ldots, x_n) = (-1)^n V_{n-1}(x)$ where $g = g(x_i)$ is the difference product in Problem 8.19,
$x = x_n$, and $V_{n-1}$ is the Vandermonde determinant defined by


$$V_{n-1}(x) = \begin{vmatrix} 1 & 1 & \cdots & 1 & 1 \\ x_1 & x_2 & \cdots & x_{n-1} & x \\ x_1^2 & x_2^2 & \cdots & x_{n-1}^2 & x^2 \\ \vdots & \vdots & & \vdots & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_{n-1}^{n-1} & x^{n-1} \end{vmatrix}$$


Let A be any matrix. Show that the signs of a minor $A[I, J]$ and its complementary minor $A[I', J']$ are
equal.

8.74. Let A be an n-square matrix. The determinantal rank of A is the order of the largest square submatrix of A
(obtained by deleting rows and columns of A) whose determinant is not zero. Show that the determinantal
rank of A is equal to its rank, that is, the maximum number of linearly independent rows (or columns).


ANSWERS TO SUPPLEMENTARY PROBLEMS 


Notation: $M = [R_1; R_2; \ldots]$ denotes a matrix with rows $R_1, R_2, \ldots$.

8.38. (a) -22, (b) -13, (c) 46, (d) -21, (e) $a^2 + ab + b^2$

8.39. (a) 3, 10; (b) 5, -2

8.40. (a) 21, (b) -11, (c) 100, (d) 0

8.41. (a) -131, (b) -55

8.42. (a) 33, (b) 0, (c) 45

8.43. (a) -32, (b) -14, (c) -468

8.44. (a) |A| = -2, adj A = [-1, -1, 1; -1, 1, -1; 2, -2, 0],
(b) |A| = -1, adj A = [1, 0, -2; -3, -1, 6; 2, 1, -5]. Also, $A^{-1} = (\operatorname{adj} A)/|A|$

8.45. (a) [-16, -29, -26, -2; 30, -38, -16, 29; 8, 51, -13, -1; 13, 1, 28, -18],
(b) [21, -14, -17, -19; -44, 11, 33, 11; -29, 1, 13, 21; 17, 7, -19, -18]

8.46. (a) adj A = [d, -b; -c, a], (c) A = kI

8.49. (a) -3, -3, (b) -23, -23, (c) 3, -3, (d) 17, -17

8.50. (a) -2, -17, 73, (b) 7, 10, 105, (c) 13, 54, 0

8.51. (a) -6, 13, 62, -219; (b) 7, -37, 30, 20

8.52. @) x=¥,y=3% ©) x=-HY=H © x=-L,y=-F 

8.53. (a) x=5,y=2,z=1, (b) Because D = 0, the system cannot be solved by determinants. 

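8.55. (a) $\operatorname{sgn} \sigma = 1$, $\operatorname{sgn} \tau = -1$, $\operatorname{sgn} \pi = -1$

8.56. (a) $\tau \circ \sigma = 53142$, (b) $\pi \circ \sigma = 52413$, (c) $\sigma^{-1} = 32154$, (d) $\tau^{-1} = 14253$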

8.60. (a) det(T) = 17, (b) det(T) = 4, (c) not defined 

8.61. (a) 0, (b) 6, (c) 1 

8.64. (a) 18, (b) 0 

8.65. 17 

8.66. (a) no, (b) yes, (c) yes, (d) no, (e) yes, (f) no


CHAPTER 9 


Diagonalization: 
Eigenvalues and Eigenvectors 


9.1 Introduction 


The ideas in this chapter can be discussed from two points of view. 


Matrix Point of View 


Suppose an n-square matrix A is given. The matrix A is said to be diagonalizable if there exists a
nonsingular matrix P such that


$$B = P^{-1}AP$$


is diagonal. This chapter discusses the diagonalization of a matrix A. In particular, an algorithm is given
to find the matrix P when it exists.


Linear Operator Point of View 


Suppose a linear operator $T: V \to V$ is given. The linear operator T is said to be diagonalizable if there
exists a basis S of V such that the matrix representation of T relative to the basis S is a diagonal matrix D. 
This chapter discusses conditions under which the linear operator T is diagonalizable. 


Equivalence of the Two Points of View 


The above two concepts are essentially the same. Specifically, a square matrix A may be viewed as a
linear operator F defined by


$$F(X) = AX$$

where X is a column vector, and $B = P^{-1}AP$ represents F relative to a new coordinate system (basis)
S whose elements are the columns of P. On the other hand, any linear operator T can be represented by a
matrix A relative to one basis and, when a second basis is chosen, T is represented by the matrix


$$B = P^{-1}AP$$


where P is the change-of-basis matrix.

Most theorems will be stated in two ways: one in terms of matrices A and again in terms of linear 
mappings T. 
Role of Underlying Field K 


The underlying number field K did not play any special role in our previous discussions on vector spaces
and linear mappings. However, the diagonalization of a matrix A or a linear operator T will depend on the
roots of a polynomial $\Delta(t)$ over K, and these roots do depend on K. For example, suppose $\Delta(t) = t^2 + 1$.
Then $\Delta(t)$ has no roots if $K = \mathbf{R}$, the real field; but $\Delta(t)$ has roots $\pm i$ if $K = \mathbf{C}$, the complex field.
Furthermore, finding the roots of a polynomial with degree greater than two is a subject unto itself
(frequently discussed in numerical analysis courses). Accordingly, our examples will usually lead to
those polynomials $\Delta(t)$ whose roots can be easily determined.


9.2 Polynomials of Matrices 


Consider a polynomial $f(t) = a_n t^n + \cdots + a_1 t + a_0$ over a field K. Recall (Section 2.8) that if A is any
square matrix, then we define


$$f(A) = a_n A^n + \cdots + a_1 A + a_0 I$$


where I is the identity matrix. In particular, we say that A is a root of f(t) if f(A) = 0, the zero matrix.


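EXAMPLE 9.1 Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$. Then $A^2 = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix}$. Let

$$f(t) = 2t^2 - 3t + 5 \quad \text{and} \quad g(t) = t^2 - 5t - 2$$

Then

$$f(A) = 2A^2 - 3A + 5I = \begin{bmatrix} 14 & 20 \\ 30 & 44 \end{bmatrix} + \begin{bmatrix} -3 & -6 \\ -9 & -12 \end{bmatrix} + \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix} = \begin{bmatrix} 16 & 14 \\ 21 & 37 \end{bmatrix}$$

and

$$g(A) = A^2 - 5A - 2I = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix} + \begin{bmatrix} -5 & -10 \\ -15 & -20 \end{bmatrix} + \begin{bmatrix} -2 & 0 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$

Thus, A is a zero of g(t).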
The following theorem (proved in Problem 9.7) applies. 


THEOREM 9.1: Let f and g be polynomials. For any square matrix A and scalar k,
(i) $(f + g)(A) = f(A) + g(A)$, (iii) $(kf)(A) = kf(A)$,
(ii) $(fg)(A) = f(A)g(A)$, (iv) $f(A)g(A) = g(A)f(A)$.


Observe that (iv) tells us that any two polynomials in A commute. 
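
The definition of f(A) translates directly into a short computation. The following is a minimal sketch, not part of the original text: the helper names `mat_mul`, `identity`, and `poly_of_matrix` are illustrative choices, matrices are plain lists of rows, and Horner's rule is used as one convenient way to evaluate the polynomial. It is applied to the matrix and polynomials of Example 9.1, with an extra polynomial h(t) = t + 1 introduced only to illustrate property (iv).

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    """Return the n-square identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def poly_of_matrix(coeffs, A):
    """Evaluate f(A) by Horner's rule.

    coeffs lists the coefficients of f(t) from the leading term down to
    the constant term, e.g. [2, -3, 5] for 2t^2 - 3t + 5.
    """
    n = len(A)
    I = identity(n)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        # result <- result * A + c * I
        result = mat_mul(result, A)
        result = [[result[i][j] + c * I[i][j] for j in range(n)]
                  for i in range(n)]
    return result


A = [[1, 2], [3, 4]]
f_A = poly_of_matrix([2, -3, 5], A)    # f(t) = 2t^2 - 3t + 5
g_A = poly_of_matrix([1, -5, -2], A)   # g(t) = t^2 - 5t - 2
h_A = poly_of_matrix([1, 1], A)        # h(t) = t + 1 (illustrative only)

print(f_A)                                       # f(A)
print(g_A)                                       # the zero matrix: A is a root of g(t)
print(mat_mul(f_A, h_A) == mat_mul(h_A, f_A))    # property (iv)
```

Because integer arithmetic is exact, the comparison for property (iv) is an exact equality test rather than a floating-point approximation.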


Matrices and Linear Operators 


Now suppose that $T: V \to V$ is a linear operator on a vector space V. Powers of T are defined by the
composition operation:


$$T^2 = T \circ T, \quad T^3 = T^2 \circ T, \quad \ldots$$


Also, for any polynomial $f(t) = a_n t^n + \cdots + a_1 t + a_0$, we define f(T) in the same way as we did for
matrices:

$$f(T) = a_n T^n + \cdots + a_1 T + a_0 I$$

where I is now the identity mapping. We also say that T is a zero or root of f(t) if f(T) = 0, the zero
mapping. We note that the relations in Theorem 9.1 hold for linear operators as they do for matrices.


Remark: Suppose A is a matrix representation of a linear operator T. Then f(A) is the matrix
representation of f(T), and, in particular, f(T) = 0 if and only if f(A) = 0.

9.3 Characteristic Polynomial, Cayley-Hamilton Theorem


Let $A = [a_{ij}]$ be an n-square matrix. The matrix $M = A - tI_n$, where $I_n$ is the n-square identity matrix and
t is an indeterminate, may be obtained by subtracting t down the diagonal of A. The negative of M is the
matrix $tI_n - A$, and its determinant


$$\Delta(t) = \det(tI_n - A) = (-1)^n \det(A - tI_n)$$


is a polynomial in t of degree n, called the characteristic polynomial of A.
We state an important theorem in linear algebra (proved in Problem 9.8).


THEOREM 9.2: (Cayley—Hamilton) Every matrix A is a root of its characteristic polynomial. 


Remark: Suppose $A = [a_{ij}]$ is a triangular matrix. Then $tI - A$ is a triangular matrix with diagonal
entries $t - a_{ii}$; hence,


$$\Delta(t) = \det(tI - A) = (t - a_{11})(t - a_{22}) \cdots (t - a_{nn})$$


Observe that the roots of $\Delta(t)$ are the diagonal elements of A.


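EXAMPLE 9.2 Let $A = \begin{bmatrix} 1 & 3 \\ 4 & 5 \end{bmatrix}$. Its characteristic polynomial is

$$\Delta(t) = |tI - A| = \begin{vmatrix} t-1 & -3 \\ -4 & t-5 \end{vmatrix} = (t-1)(t-5) - 12 = t^2 - 6t - 7$$


As expected from the Cayley-Hamilton theorem, A is a root of $\Delta(t)$; that is,


$$\Delta(A) = A^2 - 6A - 7I = \begin{bmatrix} 13 & 18 \\ 24 & 37 \end{bmatrix} + \begin{bmatrix} -6 & -18 \\ -24 & -30 \end{bmatrix} + \begin{bmatrix} -7 & 0 \\ 0 & -7 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$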


Now suppose A and B are similar matrices, say $B = P^{-1}AP$, where P is invertible. We show that A
and B have the same characteristic polynomial. Using $tI = P^{-1}tIP$, we have

$$\Delta_B(t) = \det(tI - B) = \det(tI - P^{-1}AP) = \det(P^{-1}tIP - P^{-1}AP) = \det[P^{-1}(tI - A)P] = \det(P^{-1})\det(tI - A)\det(P)$$

Using the fact that determinants are scalars and commute and that $\det(P^{-1})\det(P) = 1$, we finally obtain

$$\Delta_B(t) = \det(tI - A) = \Delta_A(t)$$

Thus, we have proved the following theorem.


THEOREM 9.3: Similar matrices have the same characteristic polynomial. 


Characteristic Polynomials of Degrees 2 and 3 


There are simple formulas for the characteristic polynomials of matrices of orders 2 and 3. 


(a) Suppose $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$. Then


$$\Delta(t) = t^2 - (a_{11} + a_{22})t + \det(A) = t^2 - \operatorname{tr}(A)\, t + \det(A)$$

Here tr(A) denotes the trace of A, that is, the sum of the diagonal elements of A.

(b) Suppose $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$. Then


$$\Delta(t) = t^3 - \operatorname{tr}(A)\, t^2 + (A_{11} + A_{22} + A_{33})t - \det(A)$$


(Here $A_{11}, A_{22}, A_{33}$ denote, respectively, the cofactors of $a_{11}, a_{22}, a_{33}$.)

EXAMPLE 9.3 Find the characteristic polynomial of each of the following matrices:


oap aop Joek 


(a) We have tr(A) = 5 + 10 = 15 and |A| = 50 - 6 = 44; hence, $\Delta(t) = t^2 - 15t + 44$.
(b) We have tr(B) = 7 + 2 = 9 and |B| = 14 + 6 = 20; hence, $\Delta(t) = t^2 - 9t + 20$.
(c) We have tr(C) = 5 - 4 = 1 and |C| = -20 + 8 = -12; hence, $\Delta(t) = t^2 - t - 12$.


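EXAMPLE 9.4 Find the characteristic polynomial of $A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 3 & 2 \\ 1 & 3 & 9 \end{bmatrix}$.


We have tr(A) = 1 + 3 + 9 = 13. The cofactors of the diagonal elements are as follows:


$$A_{11} = \begin{vmatrix} 3 & 2 \\ 3 & 9 \end{vmatrix} = 21, \quad A_{22} = \begin{vmatrix} 1 & 2 \\ 1 & 9 \end{vmatrix} = 7, \quad A_{33} = \begin{vmatrix} 1 & 1 \\ 0 & 3 \end{vmatrix} = 3$$


Thus, $A_{11} + A_{22} + A_{33} = 31$. Also, |A| = 27 + 2 + 0 - 6 - 6 - 0 = 17. Accordingly,

$$\Delta(t) = t^3 - 13t^2 + 31t - 17$$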


Remark: The coefficients of the characteristic polynomial $\Delta(t)$ of the 3-square matrix A are, with
alternating signs, as follows:


$$S_1 = \operatorname{tr}(A), \quad S_2 = A_{11} + A_{22} + A_{33}, \quad S_3 = \det(A)$$

We note that each $S_k$ is the sum of all principal minors of A of order k.


The next theorem, whose proof lies beyond the scope of this text, tells us that this result is true in
general.


THEOREM 9.4: Let A be an n-square matrix. Then its characteristic polynomial is


$$\Delta(t) = t^n - S_1 t^{n-1} + S_2 t^{n-2} + \cdots + (-1)^n S_n$$


where $S_k$ is the sum of the principal minors of order k.
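
Theorem 9.4 can be turned into a small computation. The sketch below is an illustration under stated assumptions rather than a method from the text: the function names `det`, `principal_minor_sum`, and `char_poly` are hypothetical, determinants are computed by Gaussian elimination over exact rationals, and the principal submatrices are enumerated with `itertools.combinations`. It is checked against the matrices of Examples 9.2 and 9.4.

```python
from fractions import Fraction
from itertools import combinations


def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    n = len(M)
    M = [[Fraction(x) for x in row] for row in M]
    sign, result = 1, Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign          # a row interchange changes the sign
        result *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return sign * result


def principal_minor_sum(A, k):
    """S_k: sum of the determinants of all k x k principal submatrices of A."""
    n = len(A)
    return sum(det([[A[i][j] for j in idx] for i in idx])
               for idx in combinations(range(n), k))


def char_poly(A):
    """Coefficients of the characteristic polynomial, leading term first:
    [1, -S_1, S_2, ..., (-1)^n S_n]."""
    n = len(A)
    return [Fraction(1)] + [(-1) ** k * principal_minor_sum(A, k)
                            for k in range(1, n + 1)]


coeffs_94 = char_poly([[1, 1, 2], [0, 3, 2], [1, 3, 9]])   # Example 9.4
coeffs_92 = char_poly([[1, 3], [4, 5]])                     # Example 9.2
print([int(c) for c in coeffs_94])
print([int(c) for c in coeffs_92])
```

The number of principal minors grows quickly with n, so this approach is meant for small matrices such as those in the examples.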


Characteristic Polynomial of a Linear Operator 


Now suppose $T: V \to V$ is a linear operator on a vector space V of finite dimension. We define the
characteristic polynomial $\Delta(t)$ of T to be the characteristic polynomial of any matrix representation of T.
Recall that if A and B are matrix representations of T, then $B = P^{-1}AP$, where P is a change-of-basis
matrix. Thus, A and B are similar, and by Theorem 9.3, A and B have the same characteristic polynomial.
Accordingly, the characteristic polynomial of T is independent of the particular basis in which the matrix
representation of T is computed.

Because f(T) = 0 if and only if f(A) = 0, where f(t) is any polynomial and A is any matrix
representation of T, we have the following analogous theorem for linear operators.


THEOREM 9.2′: (Cayley-Hamilton) A linear operator T is a zero of its characteristic polynomial.

9.4 Diagonalization, Eigenvalues and Eigenvectors


Let A be any n-square matrix. Then A can be represented by (or is similar to) a diagonal matrix
$D = \operatorname{diag}(k_1, k_2, \ldots, k_n)$ if and only if there exists a basis S consisting of (column) vectors $u_1, u_2, \ldots, u_n$
such that

$$Au_1 = k_1 u_1, \quad Au_2 = k_2 u_2, \quad \ldots, \quad Au_n = k_n u_n$$


In such a case, A is said to be diagonalizable. Furthermore, $D = P^{-1}AP$, where P is the nonsingular matrix
whose columns are, respectively, the basis vectors $u_1, u_2, \ldots, u_n$.


The above observation leads us to the following definition. 


DEFINITION: Let A be any square matrix. A scalar $\lambda$ is called an eigenvalue of A if there exists a
nonzero (column) vector v such that


$$Av = \lambda v$$


Any vector satisfying this relation is called an eigenvector of A belonging to the
eigenvalue $\lambda$.

We note that each scalar multiple kv of an eigenvector v belonging to $\lambda$ is also such an eigenvector,
because


$$A(kv) = k(Av) = k(\lambda v) = \lambda(kv)$$


The set $E_\lambda$ of all such eigenvectors is a subspace of V (Problem 9.19), called the eigenspace of $\lambda$. (If
$\dim E_\lambda = 1$, then $E_\lambda$ is called an eigenline and $\lambda$ is called a scaling factor.)


The terms characteristic value and characteristic vector (or proper value and proper vector) are 
sometimes used instead of eigenvalue and eigenvector. 
The above observation and definitions give us the following theorem. 


THEOREM 9.5: An n-square matrix A is similar to a diagonal matrix D if and only if A has n linearly
independent eigenvectors. In this case, the diagonal elements of D are the corresponding
eigenvalues and $D = P^{-1}AP$, where P is the matrix whose columns are the eigenvectors.


Suppose a matrix A can be diagonalized as above, say $P^{-1}AP = D$, where D is diagonal. Then A has
the extremely useful diagonal factorization:


$$A = PDP^{-1}$$


Using this factorization, the algebra of A reduces to the algebra of the diagonal matrix D, which can be
easily calculated. Specifically, suppose $D = \operatorname{diag}(k_1, k_2, \ldots, k_n)$. Then


$$A^m = (PDP^{-1})^m = PD^mP^{-1} = P \operatorname{diag}(k_1^m, \ldots, k_n^m)\, P^{-1}$$

More generally, for any polynomial f(t),

$$f(A) = f(PDP^{-1}) = Pf(D)P^{-1} = P \operatorname{diag}(f(k_1), f(k_2), \ldots, f(k_n))\, P^{-1}$$

Furthermore, if the diagonal entries of D are nonnegative, let


$$B = P \operatorname{diag}(\sqrt{k_1}, \sqrt{k_2}, \ldots, \sqrt{k_n})\, P^{-1}$$


Then B is a nonnegative square root of A; that is, $B^2 = A$ and the eigenvalues of B are nonnegative.
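EXAMPLE 9.5 Let $A = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix}$ and let $v_1 = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$ and $v_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. Then


$$Av_1 = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \end{bmatrix} = v_1 \quad \text{and} \quad Av_2 = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} = 4v_2$$


Thus, $v_1$ and $v_2$ are eigenvectors of A belonging, respectively, to the eigenvalues $\lambda_1 = 1$ and $\lambda_2 = 4$. Observe that $v_1$
and $v_2$ are linearly independent and hence form a basis of $\mathbf{R}^2$. Accordingly, A is diagonalizable. Furthermore, let P
be the matrix whose columns are the eigenvectors $v_1$ and $v_2$. That is, let


$$P = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \quad \text{and so} \quad P^{-1} = \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix}$$


Then A is similar to the diagonal matrix


$$D = P^{-1}AP = \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix}$$


As expected, the diagonal elements 1 and 4 in D are the eigenvalues corresponding, respectively, to the eigenvectors
$v_1$ and $v_2$, which are the columns of P. In particular, A has the factorization


$$A = PDP^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix}$$


Accordingly,


$$A^4 = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 256 \end{bmatrix} \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix} = \begin{bmatrix} 171 & 85 \\ 170 & 86 \end{bmatrix}$$


Moreover, suppose $f(t) = t^3 - 5t^2 + 3t + 6$; hence, f(1) = 5 and f(4) = 2. Then


$$f(A) = Pf(D)P^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix} = \begin{bmatrix} 3 & -1 \\ -2 & 4 \end{bmatrix}$$


Last, we obtain a "positive square root" of A. Specifically, using $\sqrt{1} = 1$ and $\sqrt{4} = 2$, we obtain the matrix


$$B = P\sqrt{D}\,P^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{1}{3} \end{bmatrix} = \begin{bmatrix} \tfrac{5}{3} & \tfrac{1}{3} \\ \tfrac{2}{3} & \tfrac{4}{3} \end{bmatrix}$$


where $B^2 = A$ and where B has positive eigenvalues 1 and 2.


Remark: Throughout this chapter, we use the following fact:


$$\text{If } P = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \text{ then } P^{-1} = \begin{bmatrix} d/|P| & -b/|P| \\ -c/|P| & a/|P| \end{bmatrix}$$


That is, $P^{-1}$ is obtained by interchanging the diagonal elements a and d of P, taking the negatives of the
nondiagonal elements b and c, and dividing each element by the determinant |P|.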
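
The 2 x 2 inverse rule and the diagonal factorization can be checked together with exact rational arithmetic. The following is a minimal sketch, not taken from the text: `inverse_2x2`, `diag`, and `from_diagonal` are hypothetical helper names, and the data are the eigenvectors (1, -2) and (1, 1) with eigenvalues 1 and 4 from Example 9.5.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def inverse_2x2(P):
    """Swap the diagonal, negate the off-diagonal entries, divide by |P|."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("P is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def diag(*entries):
    """Diagonal matrix with the given diagonal entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]


P = [[1, 1], [-2, 1]]        # columns are the eigenvectors v1 and v2
P_inv = inverse_2x2(P)
k1, k2 = 1, 4                # corresponding eigenvalues


def from_diagonal(d1, d2):
    """Return P diag(d1, d2) P^{-1}."""
    return mat_mul(mat_mul(P, diag(d1, d2)), P_inv)


def f(t):
    return t ** 3 - 5 * t ** 2 + 3 * t + 6


A = from_diagonal(k1, k2)                     # the matrix A itself
A_fourth = from_diagonal(k1 ** 4, k2 ** 4)    # a power of A, here A^4
f_A = from_diagonal(f(k1), f(k2))             # f(A) = P f(D) P^{-1}
B = from_diagonal(1, 2)                       # uses sqrt(1) = 1 and sqrt(4) = 2

print(A, A_fourth, f_A, B, sep="\n")
```

Only diagonal entries are ever raised to a power or passed through f, which is the point of the factorization: all of the work on A is reduced to scalar arithmetic on its eigenvalues.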


Properties of Eigenvalues and Eigenvectors 


Example 9.5 indicates the advantages of a diagonal representation (factorization) of a square matrix. In 
the following theorem (proved in Problem 9.20), we list properties that help us to find such a 
representation. 
THEOREM 9.6: Let A be a square matrix. Then the following are equivalent.
(i) A scalar $\lambda$ is an eigenvalue of A.
(ii) The matrix $M = A - \lambda I$ is singular.
(iii) The scalar $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of A.


The eigenspace $E_\lambda$ of an eigenvalue $\lambda$ is the solution space of the homogeneous system MX = 0,
where $M = A - \lambda I$; that is, M is obtained by subtracting $\lambda$ down the diagonal of A.

Some matrices have no eigenvalues and hence no eigenvectors. However, using Theorem 9.6 and the 
Fundamental Theorem of Algebra (every polynomial over the complex field C has a root), we obtain the 
following result. 


THEOREM 9.7: Let A be a square matrix over the complex field C. Then A has at least one eigenvalue. 


The following theorems will be used subsequently. (The theorem equivalent to Theorem 9.8 for linear 
operators is proved in Problem 9.21, and Theorem 9.9 is proved in Problem 9.22.) 


THEOREM 9.8: Suppose $v_1, v_2, \ldots, v_n$ are nonzero eigenvectors of a matrix A belonging to distinct
eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $v_1, v_2, \ldots, v_n$ are linearly independent.


THEOREM 9.9: Suppose the characteristic polynomial $\Delta(t)$ of an n-square matrix A is a product of n
distinct factors, say, $\Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n)$. Then A is similar to the
diagonal matrix $D = \operatorname{diag}(a_1, a_2, \ldots, a_n)$.


If $\lambda$ is an eigenvalue of a matrix A, then the algebraic multiplicity of $\lambda$ is defined to be the multiplicity
of $\lambda$ as a root of the characteristic polynomial of A, and the geometric multiplicity of $\lambda$ is defined to be the
dimension of its eigenspace, $\dim E_\lambda$. The following theorem (whose equivalent for linear operators is
proved in Problem 9.23) holds.


THEOREM 9.10: The geometric multiplicity of an eigenvalue $\lambda$ of a matrix A does not exceed its
algebraic multiplicity.


Diagonalization of Linear Operators 


Consider a linear operator $T: V \to V$. Then T is said to be diagonalizable if it can be represented by a
diagonal matrix D. Thus, T is diagonalizable if and only if there exists a basis $S = \{u_1, u_2, \ldots, u_n\}$ of V
for which

$$T(u_1) = k_1 u_1, \quad T(u_2) = k_2 u_2, \quad \ldots, \quad T(u_n) = k_n u_n$$


In such a case, T is represented by the diagonal matrix

$$D = \operatorname{diag}(k_1, k_2, \ldots, k_n)$$


relative to the basis S.
The above observation leads us to the following definitions and theorems, which are analogous to the
definitions and theorems for matrices discussed above.


DEFINITION: Let T be a linear operator. A scalar $\lambda$ is called an eigenvalue of T if there exists a
nonzero vector v such that $T(v) = \lambda v$.
Every vector satisfying this relation is called an eigenvector of T belonging to the
eigenvalue $\lambda$.

The set $E_\lambda$ of all eigenvectors belonging to an eigenvalue $\lambda$ is a subspace of V, called the
eigenspace of $\lambda$. (Alternatively, $\lambda$ is an eigenvalue of T if $\lambda I - T$ is singular, and, in this case, $E_\lambda$ is the
kernel of $\lambda I - T$.) The algebraic and geometric multiplicities of an eigenvalue $\lambda$ of a linear operator T are
defined in the same way as those of an eigenvalue of a matrix A.

The following theorems apply to a linear operator T on a vector space V of finite dimension. 


THEOREM 9.5′: T can be represented by a diagonal matrix D if and only if there exists a basis S of V
consisting of eigenvectors of T. In this case, the diagonal elements of D are the
corresponding eigenvalues.


THEOREM 9.6′: Let T be a linear operator. Then the following are equivalent:


(i) A scalar $\lambda$ is an eigenvalue of T.
(ii) The linear operator $\lambda I - T$ is singular.
(iii) The scalar $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of T.


THEOREM 9.7′: Suppose V is a complex vector space. Then T has at least one eigenvalue.


THEOREM 9.8′: Suppose $v_1, v_2, \ldots, v_n$ are nonzero eigenvectors of a linear operator T belonging to
distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $v_1, v_2, \ldots, v_n$ are linearly independent.


THEOREM 9.9′: Suppose the characteristic polynomial $\Delta(t)$ of T is a product of n distinct factors, say,
$\Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n)$. Then T can be represented by the diagonal
matrix $D = \operatorname{diag}(a_1, a_2, \ldots, a_n)$.

THEOREM 9.10′: The geometric multiplicity of an eigenvalue $\lambda$ of T does not exceed its algebraic
multiplicity.


Remark: The following theorem reduces the investigation of the diagonalization of a linear 
operator T to the diagonalization of a matrix A. 


THEOREM 9.11: Suppose A is a matrix representation of T. Then T is diagonalizable if and only if A 
is diagonalizable. 


9.5 Computing Eigenvalues and Eigenvectors, Diagonalizing Matrices 


This section gives an algorithm for computing eigenvalues and eigenvectors for a given square matrix A
and for determining whether or not a nonsingular matrix P exists such that $P^{-1}AP$ is diagonal.


ALGORITHM 9.1: (Diagonalization Algorithm) The input is an n-square matrix A.
Step 1. Find the characteristic polynomial $\Delta(t)$ of A.
Step 2. Find the roots of $\Delta(t)$ to obtain the eigenvalues of A.
Step 3. Repeat (a) and (b) for each eigenvalue $\lambda$ of A.
(a) Form the matrix $M = A - \lambda I$ by subtracting $\lambda$ down the diagonal of A.
(b) Find a basis for the solution space of the homogeneous system MX = 0. (These basis
vectors are linearly independent eigenvectors of A belonging to $\lambda$.)
Step 4. Consider the collection $S = \{v_1, v_2, \ldots, v_m\}$ of all eigenvectors obtained in Step 3.
(a) If $m \neq n$, then A is not diagonalizable.
(b) If $m = n$, then A is diagonalizable. Specifically, let P be the matrix whose columns are the
eigenvectors $v_1, v_2, \ldots, v_n$. Then


$$D = P^{-1}AP = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$$


where $\lambda_i$ is the eigenvalue corresponding to the eigenvector $v_i$.
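
Steps 3 and 4 of the algorithm are mechanical once the eigenvalues are known. The sketch below is one possible illustration, not the text's own procedure: it assumes the distinct eigenvalues from Step 2 are supplied exactly (root finding is not attempted), and the names `null_space_basis` and `diagonalize` are hypothetical. The solution space of MX = 0 is found by row reduction over the rationals.

```python
from fractions import Fraction


def null_space_basis(M):
    """Basis of the solution space of MX = 0, via reduced row echelon form."""
    rows = [[Fraction(x) for x in row] for row in M]
    n_cols = len(rows[0])
    pivots, r = [], 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot here: column c is free
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        v = [Fraction(0)] * n_cols
        v[free] = Fraction(1)             # set one free variable to 1
        for row_idx, p in enumerate(pivots):
            v[p] = -rows[row_idx][free]   # solve for the pivot variables
        basis.append(v)
    return basis


def diagonalize(A, eigenvalues):
    """Steps 3 and 4: return (P, D) if A is diagonalizable, otherwise None.

    eigenvalues must list the distinct roots of the characteristic polynomial.
    """
    n = len(A)
    vectors, diagonal = [], []
    for lam in eigenvalues:
        # Step 3(a): subtract lam down the diagonal of A.
        M = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
             for i in range(n)]
        # Step 3(b): collect a basis of the eigenspace.
        for v in null_space_basis(M):
            vectors.append(v)
            diagonal.append(lam)
    if len(vectors) != n:                 # Step 4(a)
        return None
    P = [[vectors[j][i] for j in range(n)] for i in range(n)]   # eigenvectors as columns
    D = [[diagonal[i] if i == j else 0 for j in range(n)] for i in range(n)]
    return P, D


result_a = diagonalize([[4, 2], [3, -1]], [5, -2])   # diagonalizable
result_b = diagonalize([[5, -1], [1, 3]], [4])       # only one independent eigenvector
print(result_a)
print(result_b)
```

The eigenvectors returned are one valid choice of basis; any nonzero scalar multiples (such as (-1, 3) in place of (-1/3, 1)) serve equally well as columns of P.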


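EXAMPLE 9.6 The diagonalization algorithm is applied to $A = \begin{bmatrix} 4 & 2 \\ 3 & -1 \end{bmatrix}$.

(1) The characteristic polynomial $\Delta(t)$ of A is computed. We have
tr(A) = 4 - 1 = 3, |A| = -4 - 6 = -10;
hence,

$$\Delta(t) = t^2 - 3t - 10 = (t - 5)(t + 2)$$

(2) Set $\Delta(t) = (t - 5)(t + 2) = 0$. The roots $\lambda_1 = 5$ and $\lambda_2 = -2$ are the eigenvalues of A.
(3) (i) We find an eigenvector $v_1$ of A belonging to the eigenvalue $\lambda_1 = 5$. Subtract $\lambda_1 = 5$ down the diagonal of
A to obtain the matrix $M = \begin{bmatrix} -1 & 2 \\ 3 & -6 \end{bmatrix}$. The eigenvectors belonging to $\lambda_1 = 5$ form the solution of the
homogeneous system MX = 0; that is,

$$\begin{bmatrix} -1 & 2 \\ 3 & -6 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \text{or} \quad \begin{aligned} -x + 2y &= 0 \\ 3x - 6y &= 0 \end{aligned} \quad \text{or} \quad -x + 2y = 0$$

The system has only one free variable. Thus, a nonzero solution, for example, $v_1 = (2, 1)$, is an
eigenvector that spans the eigenspace of $\lambda_1 = 5$.

(ii) We find an eigenvector $v_2$ of A belonging to the eigenvalue $\lambda_2 = -2$. Subtract -2 (or add 2) down the
diagonal of A to obtain the matrix

$$M = \begin{bmatrix} 6 & 2 \\ 3 & 1 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{aligned} 6x + 2y &= 0 \\ 3x + y &= 0 \end{aligned} \quad \text{or} \quad 3x + y = 0$$

The system has only one independent solution. Thus, a nonzero solution, say $v_2 = (-1, 3)$, is an
eigenvector that spans the eigenspace of $\lambda_2 = -2$.

(4) Let P be the matrix whose columns are the eigenvectors $v_1$ and $v_2$. Then

$$P = \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix} \quad \text{and so} \quad P^{-1} = \begin{bmatrix} \tfrac{3}{7} & \tfrac{1}{7} \\ -\tfrac{1}{7} & \tfrac{2}{7} \end{bmatrix}$$

Accordingly, $D = P^{-1}AP$ is the diagonal matrix whose diagonal entries are the corresponding eigenvalues;
that is,

$$D = P^{-1}AP = \begin{bmatrix} \tfrac{3}{7} & \tfrac{1}{7} \\ -\tfrac{1}{7} & \tfrac{2}{7} \end{bmatrix} \begin{bmatrix} 4 & 2 \\ 3 & -1 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 5 & 0 \\ 0 & -2 \end{bmatrix}$$
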
EXAMPLE 9.7 Consider the matrix $B = \begin{bmatrix} 5 & -1 \\ 1 & 3 \end{bmatrix}$. We have


$$\operatorname{tr}(B) = 5 + 3 = 8, \quad |B| = 15 + 1 = 16; \quad \text{so} \quad \Delta(t) = t^2 - 8t + 16 = (t - 4)^2$$


Accordingly, $\lambda = 4$ is the only eigenvalue of B.
Subtract $\lambda = 4$ down the diagonal of B to obtain the matrix


$$M = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{aligned} x - y &= 0 \\ x - y &= 0 \end{aligned} \quad \text{or} \quad x - y = 0$$


The system has only one independent solution; for example, x = 1, y = 1. Thus, v = (1, 1) and its multiples are the
only eigenvectors of B. Accordingly, B is not diagonalizable, because there does not exist a basis consisting of
eigenvectors of B.


EXAMPLE 9.8 Consider the matrix $A = \begin{bmatrix} 3 & -5 \\ 2 & -3 \end{bmatrix}$. Here tr(A) = 3 - 3 = 0 and |A| = -9 + 10 = 1. Thus,
$\Delta(t) = t^2 + 1$ is the characteristic polynomial of A. We consider two cases:


(a) A is a matrix over the real field $\mathbf{R}$. Then $\Delta(t)$ has no (real) roots. Thus, A has no eigenvalues and no
eigenvectors, and so A is not diagonalizable.


(b) A is a matrix over the complex field $\mathbf{C}$. Then $\Delta(t) = (t - i)(t + i)$ has two roots, i and -i. Thus, A has two
distinct eigenvalues i and -i, and hence, A has two independent eigenvectors. Accordingly there exists a
nonsingular matrix P over the complex field $\mathbf{C}$ for which


$$P^{-1}AP = \begin{bmatrix} i & 0 \\ 0 & -i \end{bmatrix}$$


Therefore, A is diagonalizable (over $\mathbf{C}$).


9.6 Diagonalizing Real Symmetric Matrices and Quadratic Forms 


There are many real matrices A that are not diagonalizable. In fact, some real matrices may not have any 
(real) eigenvalues. However, if A is a real symmetric matrix, then these problems do not exist. Namely, 
we have the following theorems. 


THEOREM 9.12: Let A be a real symmetric matrix. Then each root $\lambda$ of its characteristic polynomial is
real.


THEOREM 9.13: Let A be a real symmetric matrix. Suppose u and v are eigenvectors of A belonging
to distinct eigenvalues $\lambda_1$ and $\lambda_2$. Then u and v are orthogonal; that is, $\langle u, v \rangle = 0$.


The above two theorems give us the following fundamental result.


THEOREM 9.14: Let A be a real symmetric matrix. Then there exists an orthogonal matrix P such that
$D = P^{-1}AP$ is diagonal.


The orthogonal matrix P is obtained by normalizing a basis of orthogonal eigenvectors of A as
illustrated below. In such a case, we say that A is "orthogonally diagonalizable."


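EXAMPLE 9.9 Let $A = \begin{bmatrix} 2 & -2 \\ -2 & 5 \end{bmatrix}$, a real symmetric matrix. Find an orthogonal matrix P such that $P^{-1}AP$ is
diagonal.
First we find the characteristic polynomial $\Delta(t)$ of A. We have


$$\operatorname{tr}(A) = 2 + 5 = 7, \quad |A| = 10 - 4 = 6; \quad \text{so} \quad \Delta(t) = t^2 - 7t + 6 = (t - 6)(t - 1)$$

Accordingly, $\lambda_1 = 6$ and $\lambda_2 = 1$ are the eigenvalues of A.


(a) Subtracting $\lambda_1 = 6$ down the diagonal of A yields the matrix


$$M = \begin{bmatrix} -4 & -2 \\ -2 & -1 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{aligned} -4x - 2y &= 0 \\ -2x - y &= 0 \end{aligned} \quad \text{or} \quad 2x + y = 0$$


A nonzero solution is $u_1 = (1, -2)$.

(b) Subtracting $\lambda_2 = 1$ down the diagonal of A yields the matrix


$$M = \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix} \quad \text{and the homogeneous system} \quad x - 2y = 0$$


(The second equation drops out, because it is a multiple of the first equation.) A nonzero solution is
$u_2 = (2, 1)$.


As expected from Theorem 9.13, $u_1$ and $u_2$ are orthogonal. Normalizing $u_1$ and $u_2$ yields the orthonormal vectors

$$\hat{u}_1 = (1/\sqrt{5},\ -2/\sqrt{5}) \quad \text{and} \quad \hat{u}_2 = (2/\sqrt{5},\ 1/\sqrt{5})$$


Finally, let P be the matrix whose columns are $\hat{u}_1$ and $\hat{u}_2$, respectively. Then


$$P = \begin{bmatrix} 1/\sqrt{5} & 2/\sqrt{5} \\ -2/\sqrt{5} & 1/\sqrt{5} \end{bmatrix} \quad \text{and} \quad P^{-1}AP = \begin{bmatrix} 6 & 0 \\ 0 & 1 \end{bmatrix}$$


As expected, the diagonal entries of $P^{-1}AP$ are the eigenvalues corresponding to the columns of P.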


The procedure in the above Example 9.9 is formalized in the following algorithm, which finds an
orthogonal matrix P such that $P^{-1}AP$ is diagonal.


ALGORITHM 9.2: (Orthogonal Diagonalization Algorithm) The input is a real symmetric matrix A. 


Step 1. Find the characteristic polynomial A(t) of A. 

Step 2. Find the eigenvalues of A, which are the roots of A(t). 

Step 3. For each eigenvalue $\lambda$ of A in Step 2, find an orthogonal basis of its eigenspace.
Step 4. Normalize all eigenvectors in Step 3; the normalized vectors then form an orthonormal basis of $\mathbf{R}^n$.


Step 5. Let P be the matrix whose columns are the normalized eigenvectors in Step 4. 
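
Steps 4 and 5 can be checked numerically for the matrix of Example 9.9. The following is a minimal floating-point sketch under stated assumptions: the orthogonal eigenvectors (1, -2) and (2, 1) are taken as given from the example, and the helper names `normalize`, `transpose`, and `mat_mul` are illustrative. Because square roots are involved, results agree with the exact values only up to rounding.

```python
import math


def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def transpose(M):
    return [list(col) for col in zip(*M)]


def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


A = [[2, -2], [-2, 5]]
u1, u2 = [1, -2], [2, 1]       # orthogonal eigenvectors for eigenvalues 6 and 1

# Step 4 and Step 5: normalize, then use the unit vectors as the columns of P.
P = transpose([normalize(u1), normalize(u2)])
Pt = transpose(P)

PtP = mat_mul(Pt, P)               # close to the identity, since P is orthogonal
D = mat_mul(mat_mul(Pt, A), P)     # close to diag(6, 1)

print(PtP)
print(D)
```

Since P is orthogonal, its transpose serves as its inverse, so no matrix inversion is needed to form D.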


Application to Quadratic Forms 


Let q be a real polynomial in variables $x_1, x_2, \ldots, x_n$ such that every term in q has degree two; that is,


$$q(x_1, x_2, \ldots, x_n) = \sum_i c_i x_i^2 + \sum_{i<j} d_{ij} x_i x_j, \quad \text{where } c_i, d_{ij} \in \mathbf{R}$$

Then q is called a quadratic form. If there are no cross-product terms $x_i x_j$ (i.e., all $d_{ij} = 0$), then q is said
to be diagonal.

The above quadratic form q determines a real symmetric matrix $A = [a_{ij}]$, where $a_{ii} = c_i$ and
$a_{ij} = a_{ji} = \tfrac{1}{2} d_{ij}$. Namely, q can be written in the matrix form

$$q(X) = X^T A X$$

where $X = [x_1, x_2, \ldots, x_n]^T$ is the column vector of the variables. Furthermore, suppose X = PY is a
linear substitution of the variables. Then substitution in the quadratic form yields

$$q(Y) = (PY)^T A (PY) = Y^T (P^T A P) Y$$


Thus, $P^T A P$ is the matrix representation of q in the new variables.

We seek an orthogonal matrix P such that the orthogonal substitution X = PY yields a diagonal
quadratic form for which $P^T A P$ is diagonal. Because P is orthogonal, $P^T = P^{-1}$, and hence,
$P^T A P = P^{-1}AP$. The above theory yields such an orthogonal matrix P.
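EXAMPLE 9.10 Consider the quadratic form


$$q(x, y) = 2x^2 - 4xy + 5y^2 = X^T A X, \quad \text{where} \quad A = \begin{bmatrix} 2 & -2 \\ -2 & 5 \end{bmatrix} \quad \text{and} \quad X = \begin{bmatrix} x \\ y \end{bmatrix}$$

By Example 9.9,

$$\begin{bmatrix} 6 & 0 \\ 0 & 1 \end{bmatrix} = P^{-1}AP = P^T A P, \quad \text{where} \quad P = \begin{bmatrix} 1/\sqrt{5} & 2/\sqrt{5} \\ -2/\sqrt{5} & 1/\sqrt{5} \end{bmatrix}$$


Let $Y = [s, t]^T$. Then matrix P corresponds to the following linear orthogonal substitution X = PY of the variables x
and y in terms of the variables s and t:


$$x = \frac{1}{\sqrt{5}} s + \frac{2}{\sqrt{5}} t, \quad y = -\frac{2}{\sqrt{5}} s + \frac{1}{\sqrt{5}} t$$


This substitution in q(x, y) yields the diagonal quadratic form $q(s, t) = 6s^2 + t^2$.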
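
The result of the substitution can be confirmed by direct expansion. The worked computation below is a verification sketch added for illustration; it uses only the expressions for x and y in terms of s and t given in the example.

```latex
\begin{aligned}
x^2 &= \tfrac{1}{5}(s + 2t)^2 = \tfrac{1}{5}\,(s^2 + 4st + 4t^2), \\
xy  &= \tfrac{1}{5}(s + 2t)(-2s + t) = \tfrac{1}{5}\,(-2s^2 - 3st + 2t^2), \\
y^2 &= \tfrac{1}{5}(-2s + t)^2 = \tfrac{1}{5}\,(4s^2 - 4st + t^2), \\
q   &= 2x^2 - 4xy + 5y^2 \\
    &= \tfrac{1}{5}\bigl[(2 + 8 + 20)\,s^2 + (8 + 12 - 20)\,st + (8 - 8 + 5)\,t^2\bigr] \\
    &= 6s^2 + t^2 .
\end{aligned}
```

The cross-product term cancels exactly, and the remaining coefficients 6 and 1 are the eigenvalues of A, in the order of the columns of P.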


9.7 Minimal Polynomial 


Let A be any square matrix. Let J(A) denote the collection of all polynomials f(t) for which A is a root,
that is, for which f(A) = 0. The set J(A) is not empty, because the Cayley-Hamilton Theorem 9.2 tells us
that the characteristic polynomial $\Delta_A(t)$ of A belongs to J(A). Let m(t) denote the monic polynomial of
lowest degree in J(A). (Such a polynomial m(t) exists and is unique.) We call m(t) the minimal
polynomial of the matrix A.


Remark: A polynomial $f(t) \neq 0$ is monic if its leading coefficient equals one.

The following theorem (proved in Problem 9.33) holds.


THEOREM 9.15: The minimal polynomial m(t) of a matrix (linear operator) A divides every polynomial that has A as a zero. In particular, m(t) divides the characteristic polynomial $\Delta(t)$ of A.

There is an even stronger relationship between m(t) and $\Delta(t)$.

THEOREM 9.16: The characteristic polynomial $\Delta(t)$ and the minimal polynomial m(t) of a matrix A have the same irreducible factors.

This theorem (proved in Problem 9.35) does not say that $m(t) = \Delta(t)$, only that any irreducible factor of one must divide the other. In particular, because a linear factor is irreducible, m(t) and $\Delta(t)$ have the same linear factors. Hence, they have the same roots. Thus, we have the following theorem.

THEOREM 9.17: A scalar $\lambda$ is an eigenvalue of the matrix A if and only if $\lambda$ is a root of the minimal polynomial of A.


EXAMPLE 9.11 Find the minimal polynomial m(t) of $A = \begin{bmatrix} 2 & 2 & -5 \\ 3 & 7 & -15 \\ 1 & 2 & -4 \end{bmatrix}$.

First find the characteristic polynomial $\Delta(t)$ of A. We have
$$\operatorname{tr}(A) = 5, \qquad A_{11} + A_{22} + A_{33} = 2 - 3 + 8 = 7, \qquad \text{and} \qquad |A| = 3$$
Hence,
$$\Delta(t) = t^3 - 5t^2 + 7t - 3 = (t - 1)^2 (t - 3)$$
The minimal polynomial m(t) must divide $\Delta(t)$. Also, each irreducible factor of $\Delta(t)$ (i.e., t - 1 and t - 3) must also be a factor of m(t). Thus, m(t) is exactly one of the following:
$$f(t) = (t - 3)(t - 1) \qquad \text{or} \qquad g(t) = (t - 3)(t - 1)^2$$
We know, by the Cayley-Hamilton theorem, that $g(A) = \Delta(A) = 0$. Hence, we need only test f(t). We have
$$f(A) = (A - I)(A - 3I) = \begin{bmatrix} 1 & 2 & -5 \\ 3 & 6 & -15 \\ 1 & 2 & -5 \end{bmatrix} \begin{bmatrix} -1 & 2 & -5 \\ 3 & 4 & -15 \\ 1 & 2 & -7 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
Thus, $f(t) = m(t) = (t - 1)(t - 3) = t^2 - 4t + 3$ is the minimal polynomial of A.
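The test of the candidate f(t) in Example 9.11 is a single matrix product, which is easy to reproduce. The sketch below uses plain Python lists; the helper names `matmul` and `shift` are assumptions introduced here for illustration.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def shift(A, c):
    """Return A - c*I for a square matrix A."""
    n = len(A)
    return [[A[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]

A = [[2, 2, -5], [3, 7, -15], [1, 2, -4]]
zero = [[0] * 3 for _ in range(3)]

# f(A) = (A - I)(A - 3I) vanishes, so f(t) = (t - 1)(t - 3) annihilates A.
f_of_A = matmul(shift(A, 1), shift(A, 3))
assert f_of_A == zero

# Neither linear factor alone annihilates A, consistent with deg m(t) = 2.
assert shift(A, 1) != zero and shift(A, 3) != zero
```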


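EXAMPLE 9.12

(a) Consider the following two r-square matrices, where $a \neq 0$:
$$J(\lambda, r) = \begin{bmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{bmatrix} \qquad \text{and} \qquad A = \begin{bmatrix} \lambda & a & 0 & \cdots & 0 & 0 \\ 0 & \lambda & a & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & a \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{bmatrix}$$
The first matrix, called a Jordan Block, has $\lambda$'s on the diagonal, 1's on the superdiagonal (consisting of the entries above the diagonal entries), and 0's elsewhere. The second matrix A has $\lambda$'s on the diagonal, a's on the superdiagonal, and 0's elsewhere. [Thus, A is a generalization of $J(\lambda, r)$.] One can show that
$$f(t) = (t - \lambda)^r$$
is both the characteristic and minimal polynomial of both $J(\lambda, r)$ and A.

(b) Consider an arbitrary monic polynomial:
$$f(t) = t^n + a_{n-1} t^{n-1} + \cdots + a_1 t + a_0$$
Let C(f) be the n-square matrix with 1's on the subdiagonal (consisting of the entries below the diagonal entries), the negatives of the coefficients in the last column, and 0's elsewhere as follows:
$$C(f) = \begin{bmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_{n-1} \end{bmatrix}$$
Then C(f) is called the companion matrix of the polynomial f(t). Moreover, the minimal polynomial m(t) and the characteristic polynomial $\Delta(t)$ of the companion matrix C(f) are both equal to the original polynomial f(t).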
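As an illustration of part (b), the sketch below builds a companion matrix and confirms that the polynomial annihilates it. The function names `companion` and `poly_at_matrix` are hypothetical helpers, and the sample polynomial $t^3 - 5t^2 + 7t - 3$ is chosen only as an example. The check shows f(C(f)) = 0; it does not by itself establish that f is the minimal polynomial.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def companion(coeffs):
    """Companion matrix of the monic f(t) = t^n + a_{n-1} t^{n-1} + ... + a_0,
    where coeffs = [a_0, a_1, ..., a_{n-1}]."""
    n = len(coeffs)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1              # 1's on the subdiagonal
    for i in range(n):
        C[i][n - 1] = -coeffs[i]     # negated coefficients in the last column
    return C

def poly_at_matrix(coeffs, M):
    """Evaluate the monic polynomial with low-order coefficients `coeffs` at M
    using Horner's rule: result <- result*M + a*I, from a_{n-1} down to a_0."""
    n = len(M)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # leading 1
    for a in reversed(coeffs):
        result = matmul(result, M)
        for i in range(n):
            result[i][i] += a
    return result

coeffs = [-3, 7, -5]                 # f(t) = t^3 - 5t^2 + 7t - 3
C = companion(coeffs)
assert C == [[0, 0, 3], [1, 0, -7], [0, 1, 5]]
assert poly_at_matrix(coeffs, C) == [[0] * 3 for _ in range(3)]
```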


Minimal Polynomial of a Linear Operator 


The minimal polynomial m(t) of a linear operator T is defined to be the monic polynomial of lowest degree for which T is a root. However, for any polynomial f(t), we have
$$f(T) = 0 \quad \text{if and only if} \quad f(A) = 0$$
where A is any matrix representation of T. Accordingly, T and A have the same minimal polynomials. Thus, the above theorems on the minimal polynomial of a matrix also hold for the minimal polynomial of a linear operator. That is, we have the following theorems.

THEOREM 9.15': The minimal polynomial m(t) of a linear operator T divides every polynomial that has T as a root. In particular, m(t) divides the characteristic polynomial $\Delta(t)$ of T.

THEOREM 9.16': The characteristic and minimal polynomials of a linear operator T have the same irreducible factors.

THEOREM 9.17': A scalar $\lambda$ is an eigenvalue of a linear operator T if and only if $\lambda$ is a root of the minimal polynomial m(t) of T.


9.8 Characteristic and Minimal Polynomials of Block Matrices 


This section discusses the relationship of the characteristic polynomial and the minimal polynomial to 
certain (square) block matrices. 


Characteristic Polynomial and Block Triangular Matrices 


Suppose M is a block triangular matrix, say $M = \begin{bmatrix} A_1 & B \\ 0 & A_2 \end{bmatrix}$, where $A_1$ and $A_2$ are square matrices. Then tI - M is also a block triangular matrix, with diagonal blocks $tI - A_1$ and $tI - A_2$. Thus,
$$|tI - M| = \begin{vmatrix} tI - A_1 & -B \\ 0 & tI - A_2 \end{vmatrix} = |tI - A_1| \, |tI - A_2|$$
That is, the characteristic polynomial of M is the product of the characteristic polynomials of the diagonal blocks $A_1$ and $A_2$.

By induction, we obtain the following useful result.

THEOREM 9.18: Suppose M is a block triangular matrix with diagonal blocks $A_1, A_2, \ldots, A_r$. Then the characteristic polynomial of M is the product of the characteristic polynomials of the diagonal blocks $A_i$; that is,
$$\Delta_M(t) = \Delta_{A_1}(t) \, \Delta_{A_2}(t) \cdots \Delta_{A_r}(t)$$


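EXAMPLE 9.13 Consider the matrix $M = \left[\begin{array}{rr|rr} 9 & -1 & 5 & 7 \\ 8 & 3 & 5 & -4 \\ \hline 0 & 0 & 3 & 6 \\ 0 & 0 & -1 & 8 \end{array}\right]$.

Then M is a block triangular matrix with diagonal blocks $A = \begin{bmatrix} 9 & -1 \\ 8 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 6 \\ -1 & 8 \end{bmatrix}$. Here
$$\operatorname{tr}(A) = 9 + 3 = 12, \quad \det(A) = 27 + 8 = 35, \quad \text{and so} \quad \Delta_A(t) = t^2 - 12t + 35 = (t - 5)(t - 7)$$
$$\operatorname{tr}(B) = 3 + 8 = 11, \quad \det(B) = 24 + 6 = 30, \quad \text{and so} \quad \Delta_B(t) = t^2 - 11t + 30 = (t - 5)(t - 6)$$
Accordingly, the characteristic polynomial of M is the product
$$\Delta_M(t) = \Delta_A(t) \, \Delta_B(t) = (t - 5)^2 (t - 6)(t - 7)$$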
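Theorem 9.18 can be checked on a matrix of this shape by computing the characteristic polynomial of the full matrix independently and comparing it with the product of the block polynomials. The sketch below uses the Faddeev-LeVerrier recursion, a standard technique that is not the method of the text; the names `char_poly` and `poly_mul` are illustrative. The entries of the upper-right block are taken as printed above and do not affect the result.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def char_poly(A):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(tI - A), computed with the
    Faddeev-LeVerrier recursion M_k = A*M_{k-1} + c*I, c <- -tr(A*M_k)/k."""
    n = len(A)
    M = [[Fraction(0)] * n for _ in range(n)]
    c = Fraction(1)
    coeffs = [c]
    for k in range(1, n + 1):
        AM = matmul(A, M)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
        AMk = matmul(A, M)
        c = -sum(AMk[i][i] for i in range(n)) / k
        coeffs.append(c)
    return coeffs

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (highest power first)."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

M = [[9, -1, 5, 7], [8, 3, 5, -4], [0, 0, 3, 6], [0, 0, -1, 8]]
A = [[9, -1], [8, 3]]
B = [[3, 6], [-1, 8]]

assert char_poly(A) == [1, -12, 35]
assert char_poly(B) == [1, -11, 30]
assert char_poly(M) == poly_mul(char_poly(A), char_poly(B))
```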


Minimal Polynomial and Block Diagonal Matrices

The following theorem (proved in Problem 9.36) holds.

THEOREM 9.19: Suppose M is a block diagonal matrix with diagonal blocks $A_1, A_2, \ldots, A_r$. Then the minimal polynomial of M is equal to the least common multiple (LCM) of the minimal polynomials of the diagonal blocks $A_i$.

Remark: We emphasize that this theorem applies to block diagonal matrices, whereas the analogous Theorem 9.18 on characteristic polynomials applies to block triangular matrices.
EXAMPLE 9.14 Find the characteristic polynomial $\Delta(t)$ and the minimal polynomial m(t) of the block diagonal matrix:
$$M = \left[\begin{array}{rr|rr|r} 2 & 5 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ \hline 0 & 0 & 4 & 2 & 0 \\ 0 & 0 & 3 & 5 & 0 \\ \hline 0 & 0 & 0 & 0 & 7 \end{array}\right] = \operatorname{diag}(A_1, A_2, A_3), \quad \text{where } A_1 = \begin{bmatrix} 2 & 5 \\ 0 & 2 \end{bmatrix}, \; A_2 = \begin{bmatrix} 4 & 2 \\ 3 & 5 \end{bmatrix}, \; A_3 = [7]$$
Then $\Delta(t)$ is the product of the characteristic polynomials $\Delta_1(t)$, $\Delta_2(t)$, $\Delta_3(t)$ of $A_1$, $A_2$, $A_3$, respectively. One can show that
$$\Delta_1(t) = (t - 2)^2, \qquad \Delta_2(t) = (t - 2)(t - 7), \qquad \Delta_3(t) = t - 7$$
Thus, $\Delta(t) = (t - 2)^3 (t - 7)^2$. [As expected, deg $\Delta(t) = 5$.]

The minimal polynomials $m_1(t)$, $m_2(t)$, $m_3(t)$ of the diagonal blocks $A_1$, $A_2$, $A_3$, respectively, are equal to the characteristic polynomials; that is,
$$m_1(t) = (t - 2)^2, \qquad m_2(t) = (t - 2)(t - 7), \qquad m_3(t) = t - 7$$
But m(t) is equal to the least common multiple of $m_1(t)$, $m_2(t)$, $m_3(t)$. Thus, $m(t) = (t - 2)^2 (t - 7)$.
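The conclusion of Example 9.14 can be confirmed directly: $(M - 2I)^2 (M - 7I)$ should vanish, while the lower-degree candidate $(M - 2I)(M - 7I)$ should not. The following is a minimal sketch with illustrative helper names (`matmul`, `shift`, `is_zero`).

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def shift(A, c):
    """Return A - c*I for a square matrix A."""
    n = len(A)
    return [[A[i][j] - (c if i == j else 0) for j in range(n)] for i in range(n)]

def is_zero(A):
    return all(x == 0 for row in A for x in row)

M = [[2, 5, 0, 0, 0],
     [0, 2, 0, 0, 0],
     [0, 0, 4, 2, 0],
     [0, 0, 3, 5, 0],
     [0, 0, 0, 0, 7]]

M2, M7 = shift(M, 2), shift(M, 7)
lcm_value = matmul(matmul(M2, M2), M7)   # (M - 2I)^2 (M - 7I)
smaller = matmul(M2, M7)                 # (M - 2I)(M - 7I)

assert is_zero(lcm_value)       # the LCM annihilates M
assert not is_zero(smaller)     # dropping one factor (t - 2) does not
```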


SOLVED PROBLEMS 


Polynomials of Matrices, Characteristic Polynomials 


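9.1. Let $A = \begin{bmatrix} 1 & -2 \\ 4 & 5 \end{bmatrix}$. Find f(A), where
(a) $f(t) = t^2 - 3t + 7$, (b) $f(t) = t^2 - 6t + 13$

First find $A^2 = \begin{bmatrix} 1 & -2 \\ 4 & 5 \end{bmatrix} \begin{bmatrix} 1 & -2 \\ 4 & 5 \end{bmatrix} = \begin{bmatrix} -7 & -12 \\ 24 & 17 \end{bmatrix}$. Then

(a) $f(A) = A^2 - 3A + 7I = \begin{bmatrix} -7 & -12 \\ 24 & 17 \end{bmatrix} + \begin{bmatrix} -3 & 6 \\ -12 & -15 \end{bmatrix} + \begin{bmatrix} 7 & 0 \\ 0 & 7 \end{bmatrix} = \begin{bmatrix} -3 & -6 \\ 12 & 9 \end{bmatrix}$

(b) $f(A) = A^2 - 6A + 13I = \begin{bmatrix} -7 & -12 \\ 24 & 17 \end{bmatrix} + \begin{bmatrix} -6 & 12 \\ -24 & -30 \end{bmatrix} + \begin{bmatrix} 13 & 0 \\ 0 & 13 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$

[Thus, A is a root of f(t).]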
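Evaluating a polynomial at a matrix, as in Problem 9.1, is mechanical and can be sketched with Horner's rule. The function name `poly_at_matrix` and the highest-power-first coefficient convention are choices made for this illustration only.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def poly_at_matrix(coeffs, A):
    """Evaluate f(A), where coeffs lists the coefficients of f(t) from the
    highest power down to the constant term (the constant multiplies I)."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = matmul(result, A)      # Horner step: result <- result*A + c*I
        for i in range(n):
            result[i][i] += c
    return result

A = [[1, -2], [4, 5]]
assert poly_at_matrix([1, -3, 7], A) == [[-3, -6], [12, 9]]    # part (a)
assert poly_at_matrix([1, -6, 13], A) == [[0, 0], [0, 0]]      # part (b)
```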


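9.2. Find the characteristic polynomial $\Delta(t)$ of each of the following matrices:
$$\text{(a) } A = \begin{bmatrix} 2 & 5 \\ 4 & 1 \end{bmatrix}, \qquad \text{(b) } B = \begin{bmatrix} 7 & -3 \\ 5 & -2 \end{bmatrix}, \qquad \text{(c) } C = \begin{bmatrix} 3 & -2 \\ 9 & -3 \end{bmatrix}$$
Use the formula $\Delta(t) = t^2 - \operatorname{tr}(M)\,t + |M|$ for a 2 x 2 matrix M:
(a) tr(A) = 2 + 1 = 3, |A| = 2 - 20 = -18, so $\Delta(t) = t^2 - 3t - 18$
(b) tr(B) = 7 - 2 = 5, |B| = -14 + 15 = 1, so $\Delta(t) = t^2 - 5t + 1$
(c) tr(C) = 3 - 3 = 0, |C| = -9 + 18 = 9, so $\Delta(t) = t^2 + 9$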


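9.3. Find the characteristic polynomial $\Delta(t)$ of each of the following matrices:
$$\text{(a) } A = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 4 \\ 6 & 4 & 5 \end{bmatrix}, \qquad \text{(b) } B = \begin{bmatrix} 1 & 6 & -2 \\ -3 & 2 & 0 \\ 0 & 3 & -4 \end{bmatrix}$$
Use the formula $\Delta(t) = t^3 - \operatorname{tr}(A)\,t^2 + (A_{11} + A_{22} + A_{33})\,t - |A|$, where $A_{ii}$ is the cofactor of $a_{ii}$ in the 3 x 3 matrix $A = [a_{ij}]$.

(a) tr(A) = 1 + 0 + 5 = 6,
$$A_{11} = \begin{vmatrix} 0 & 4 \\ 4 & 5 \end{vmatrix} = -16, \qquad A_{22} = \begin{vmatrix} 1 & 3 \\ 6 & 5 \end{vmatrix} = -13, \qquad A_{33} = \begin{vmatrix} 1 & 2 \\ 3 & 0 \end{vmatrix} = -6$$
$A_{11} + A_{22} + A_{33} = -35$, and |A| = 48 + 36 - 16 - 30 = 38
Thus, $\Delta(t) = t^3 - 6t^2 - 35t - 38$

(b) tr(B) = 1 + 2 - 4 = -1,
$$B_{11} = \begin{vmatrix} 2 & 0 \\ 3 & -4 \end{vmatrix} = -8, \qquad B_{22} = \begin{vmatrix} 1 & -2 \\ 0 & -4 \end{vmatrix} = -4, \qquad B_{33} = \begin{vmatrix} 1 & 6 \\ -3 & 2 \end{vmatrix} = 20$$
$B_{11} + B_{22} + B_{33} = 8$, and |B| = -8 + 18 - 72 = -62
Thus, $\Delta(t) = t^3 + t^2 + 8t + 62$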
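The 3 x 3 formula used in Problem 9.3 (trace, sum of the diagonal cofactors, determinant) translates directly into code. The sketch below is illustrative; `det3` and `char_poly_3x3` are hypothetical names, and the returned list holds the coefficients of $t^3, t^2, t, 1$ in that order.

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly_3x3(M):
    """Coefficients of t^3 - tr(M) t^2 + (M11 + M22 + M33) t - |M|,
    where Mii is the cofactor of the diagonal entry m_ii."""
    trace = M[0][0] + M[1][1] + M[2][2]
    m11 = M[1][1] * M[2][2] - M[1][2] * M[2][1]
    m22 = M[0][0] * M[2][2] - M[0][2] * M[2][0]
    m33 = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [1, -trace, m11 + m22 + m33, -det3(M)]

A = [[1, 2, 3], [3, 0, 4], [6, 4, 5]]
B = [[1, 6, -2], [-3, 2, 0], [0, 3, -4]]

poly_A = char_poly_3x3(A)   # coefficients for matrix A of part (a)
poly_B = char_poly_3x3(B)   # coefficients for matrix B of part (b)
assert poly_A == [1, -6, -35, -38]
```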


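9.4. Find the characteristic polynomial $\Delta(t)$ of each of the following matrices:
$$\text{(a) } A = \begin{bmatrix} 2 & 5 & 1 & 1 \\ 1 & 4 & 2 & 2 \\ 0 & 0 & 6 & -5 \\ 0 & 0 & 2 & 3 \end{bmatrix}, \qquad \text{(b) } B = \begin{bmatrix} 1 & 1 & 2 & 2 \\ 0 & 3 & 3 & 4 \\ 0 & 0 & 5 & 5 \\ 0 & 0 & 0 & 6 \end{bmatrix}$$
(a) A is block triangular with diagonal blocks
$$A_1 = \begin{bmatrix} 2 & 5 \\ 1 & 4 \end{bmatrix} \qquad \text{and} \qquad A_2 = \begin{bmatrix} 6 & -5 \\ 2 & 3 \end{bmatrix}$$
Thus, $\Delta(t) = \Delta_{A_1}(t) \, \Delta_{A_2}(t) = (t^2 - 6t + 3)(t^2 - 9t + 28)$

(b) Because B is triangular, $\Delta(t) = (t - 1)(t - 3)(t - 5)(t - 6)$.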


9.5. Find the characteristic polynomial $\Delta(t)$ of each of the following linear operators:
(a) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by F(x, y) = (3x + 5y, 2x - 7y).
(b) $D: V \to V$ defined by D(f) = df/dt, where V is the space of functions with basis S = {sin t, cos t}.

The characteristic polynomial $\Delta(t)$ of a linear operator is equal to the characteristic polynomial of any matrix A that represents the linear operator.

(a) Find the matrix A that represents F relative to the usual basis of $\mathbf{R}^2$. We have
$$A = \begin{bmatrix} 3 & 5 \\ 2 & -7 \end{bmatrix}, \quad \text{so} \quad \Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 + 4t - 31$$
(b) Find the matrix A representing the differential operator D relative to the basis S. We have
$$\begin{aligned} D(\sin t) &= \cos t = 0(\sin t) + 1(\cos t) \\ D(\cos t) &= -\sin t = -1(\sin t) + 0(\cos t) \end{aligned} \quad \text{and so} \quad A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$$
Therefore, $\Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 + 1$


9.6. Show that a matrix A and its transpose $A^T$ have the same characteristic polynomial.

By the transpose operation, $(tI - A)^T = tI^T - A^T = tI - A^T$. Because a matrix and its transpose have the same determinant,
$$\Delta_A(t) = |tI - A| = |(tI - A)^T| = |tI - A^T| = \Delta_{A^T}(t)$$
9.7. Prove Theorem 9.1: Let f and g be polynomials. For any square matrix A and scalar k,
(i) (f + g)(A) = f(A) + g(A), (iii) (kf)(A) = kf(A),
(ii) (fg)(A) = f(A)g(A), (iv) f(A)g(A) = g(A)f(A).

Suppose $f = a_n t^n + \cdots + a_1 t + a_0$ and $g = b_m t^m + \cdots + b_1 t + b_0$. Then, by definition,
$$f(A) = a_n A^n + \cdots + a_1 A + a_0 I \qquad \text{and} \qquad g(A) = b_m A^m + \cdots + b_1 A + b_0 I$$
(i) Suppose $m \le n$ and let $b_i = 0$ if i > m. Then
$$f + g = (a_n + b_n) t^n + \cdots + (a_1 + b_1) t + (a_0 + b_0)$$
Hence,
$$\begin{aligned} (f + g)(A) &= (a_n + b_n) A^n + \cdots + (a_1 + b_1) A + (a_0 + b_0) I \\ &= a_n A^n + b_n A^n + \cdots + a_1 A + b_1 A + a_0 I + b_0 I = f(A) + g(A) \end{aligned}$$
(ii) By definition, $fg = c_{n+m} t^{n+m} + \cdots + c_1 t + c_0 = \sum_{k=0}^{n+m} c_k t^k$, where
$$c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0 = \sum_{i=0}^{k} a_i b_{k-i}$$
Hence, $(fg)(A) = \sum_{k=0}^{n+m} c_k A^k$ and
$$f(A)g(A) = \left( \sum_{i=0}^{n} a_i A^i \right) \left( \sum_{j=0}^{m} b_j A^j \right) = \sum_{i=0}^{n} \sum_{j=0}^{m} a_i b_j A^{i+j} = \sum_{k=0}^{n+m} c_k A^k = (fg)(A)$$
(iii) By definition, $kf = k a_n t^n + \cdots + k a_1 t + k a_0$, and so
$$(kf)(A) = k a_n A^n + \cdots + k a_1 A + k a_0 I = k(a_n A^n + \cdots + a_1 A + a_0 I) = kf(A)$$
(iv) By (ii), g(A)f(A) = (gf)(A) = (fg)(A) = f(A)g(A).
9.8. Prove the Cayley-Hamilton Theorem 9.2: Every matrix A is a root of its characteristic polynomial $\Delta(t)$.

Let A be an arbitrary n-square matrix and let $\Delta(t)$ be its characteristic polynomial, say,
$$\Delta(t) = |tI - A| = t^n + a_{n-1} t^{n-1} + \cdots + a_1 t + a_0$$
Now let B(t) denote the classical adjoint of the matrix tI - A. The elements of B(t) are cofactors of the matrix tI - A and hence are polynomials in t of degree not exceeding n - 1. Thus,
$$B(t) = B_{n-1} t^{n-1} + \cdots + B_1 t + B_0$$
where the $B_i$ are n-square matrices over K which are independent of t. By the fundamental property of the classical adjoint (Theorem 8.9), (tI - A)B(t) = |tI - A| I, or
$$(tI - A)(B_{n-1} t^{n-1} + \cdots + B_1 t + B_0) = (t^n + a_{n-1} t^{n-1} + \cdots + a_1 t + a_0) I$$
Removing the parentheses and equating corresponding powers of t yields
$$B_{n-1} = I, \quad B_{n-2} - A B_{n-1} = a_{n-1} I, \quad \ldots, \quad B_0 - A B_1 = a_1 I, \quad -A B_0 = a_0 I$$
Multiplying the above equations by $A^n, A^{n-1}, \ldots, A, I$, respectively, yields
$$A^n B_{n-1} = A^n, \quad A^{n-1} B_{n-2} - A^n B_{n-1} = a_{n-1} A^{n-1}, \quad \ldots, \quad A B_0 - A^2 B_1 = a_1 A, \quad -A B_0 = a_0 I$$
Adding the above matrix equations yields 0 on the left-hand side and $\Delta(A)$ on the right-hand side; that is,
$$0 = A^n + a_{n-1} A^{n-1} + \cdots + a_1 A + a_0 I$$
Therefore, $\Delta(A) = 0$, which is the Cayley-Hamilton theorem.
Eigenvalues and Eigenvectors of 2 x 2 Matrices

9.9. Let $A = \begin{bmatrix} 3 & -4 \\ 2 & -6 \end{bmatrix}$.

(a) Find all eigenvalues and corresponding eigenvectors.
(b) Find matrices P and D such that P is nonsingular and $D = P^{-1}AP$ is diagonal.

(a) First find the characteristic polynomial $\Delta(t)$ of A:
$$\Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 + 3t - 10 = (t - 2)(t + 5)$$
The roots $\lambda = 2$ and $\lambda = -5$ of $\Delta(t)$ are the eigenvalues of A. We find corresponding eigenvectors.

(i) Subtract $\lambda = 2$ down the diagonal of A to obtain the matrix M = A - 2I, where the corresponding homogeneous system MX = 0 yields the eigenvectors corresponding to $\lambda = 2$. We have
$$M = \begin{bmatrix} 1 & -4 \\ 2 & -8 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} x - 4y &= 0 \\ 2x - 8y &= 0 \end{aligned} \quad \text{or} \quad x - 4y = 0$$
The system has only one free variable, and $v_1 = (4, 1)$ is a nonzero solution. Thus, $v_1 = (4, 1)$ is an eigenvector belonging to (and spanning the eigenspace of) $\lambda = 2$.

(ii) Subtract $\lambda = -5$ (or, equivalently, add 5) down the diagonal of A to obtain
$$M = \begin{bmatrix} 8 & -4 \\ 2 & -1 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} 8x - 4y &= 0 \\ 2x - y &= 0 \end{aligned} \quad \text{or} \quad 2x - y = 0$$
The system has only one free variable, and $v_2 = (1, 2)$ is a nonzero solution. Thus, $v_2 = (1, 2)$ is an eigenvector belonging to $\lambda = -5$.

(b) Let P be the matrix whose columns are $v_1$ and $v_2$. Then
$$P = \begin{bmatrix} 4 & 1 \\ 1 & 2 \end{bmatrix} \qquad \text{and} \qquad D = P^{-1}AP = \begin{bmatrix} 2 & 0 \\ 0 & -5 \end{bmatrix}$$
Note that D is the diagonal matrix whose diagonal entries are the eigenvalues of A corresponding to the eigenvectors appearing in P.

Remark: Here P is the change-of-basis matrix from the usual basis of $\mathbf{R}^2$ to the basis $S = \{v_1, v_2\}$, and D is the matrix that represents (the matrix function) A relative to the new basis S.


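9.10. Let $A = \begin{bmatrix} 2 & 2 \\ 1 & 3 \end{bmatrix}$.

(a) Find all eigenvalues and corresponding eigenvectors.
(b) Find a nonsingular matrix P such that $D = P^{-1}AP$ is diagonal, and $P^{-1}$.
(c) Find $A^6$ and f(A), where $f(t) = t^4 - 3t^3 - 6t^2 + 7t + 3$.
(d) Find a "real cube root" of A, that is, a matrix B such that $B^3 = A$ and B has real eigenvalues.

(a) First find the characteristic polynomial $\Delta(t)$ of A:
$$\Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 - 5t + 4 = (t - 1)(t - 4)$$
The roots $\lambda = 1$ and $\lambda = 4$ of $\Delta(t)$ are the eigenvalues of A. We find corresponding eigenvectors.

(i) Subtract $\lambda = 1$ down the diagonal of A to obtain the matrix M = A - I, where the corresponding homogeneous system MX = 0 yields the eigenvectors belonging to $\lambda = 1$. We have
$$M = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} x + 2y &= 0 \\ x + 2y &= 0 \end{aligned} \quad \text{or} \quad x + 2y = 0$$
The system has only one independent solution; for example, x = 2, y = -1. Thus, $v_1 = (2, -1)$ is an eigenvector belonging to (and spanning the eigenspace of) $\lambda = 1$.

(ii) Subtract $\lambda = 4$ down the diagonal of A to obtain
$$M = \begin{bmatrix} -2 & 2 \\ 1 & -1 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} -2x + 2y &= 0 \\ x - y &= 0 \end{aligned} \quad \text{or} \quad x - y = 0$$
The system has only one independent solution; for example, x = 1, y = 1. Thus, $v_2 = (1, 1)$ is an eigenvector belonging to $\lambda = 4$.

(b) Let P be the matrix whose columns are $v_1$ and $v_2$. Then
$$P = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix} \quad \text{and} \quad D = P^{-1}AP = \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix}, \quad \text{where} \quad P^{-1} = \begin{bmatrix} \frac{1}{3} & -\frac{1}{3} \\ \frac{1}{3} & \frac{2}{3} \end{bmatrix}$$
(c) Using the diagonal factorization $A = PDP^{-1}$, and $1^6 = 1$ and $4^6 = 4096$, we get
$$A^6 = PD^6P^{-1} = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 4096 \end{bmatrix} \begin{bmatrix} \frac{1}{3} & -\frac{1}{3} \\ \frac{1}{3} & \frac{2}{3} \end{bmatrix} = \begin{bmatrix} 1366 & 2730 \\ 1365 & 2731 \end{bmatrix}$$
Also, f(1) = 2 and f(4) = -1. Hence,
$$f(A) = Pf(D)P^{-1} = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} \frac{1}{3} & -\frac{1}{3} \\ \frac{1}{3} & \frac{2}{3} \end{bmatrix} = \begin{bmatrix} 1 & -2 \\ -1 & 0 \end{bmatrix}$$
(d) Here $\begin{bmatrix} 1 & 0 \\ 0 & \sqrt[3]{4} \end{bmatrix}$ is the real cube root of D. Hence the real cube root of A is
$$B = P \sqrt[3]{D} \, P^{-1} = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & \sqrt[3]{4} \end{bmatrix} \begin{bmatrix} \frac{1}{3} & -\frac{1}{3} \\ \frac{1}{3} & \frac{2}{3} \end{bmatrix} = \frac{1}{3} \begin{bmatrix} 2 + \sqrt[3]{4} & -2 + 2\sqrt[3]{4} \\ -1 + \sqrt[3]{4} & 1 + 2\sqrt[3]{4} \end{bmatrix}$$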
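The power computation in Problem 9.10(c) can be cross-checked by comparing $PD^6P^{-1}$ with six direct multiplications. The sketch below uses exact rational arithmetic from the standard `fractions` module; the variable and helper names are illustrative.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[2, 2], [1, 3]]
P = [[2, 1], [-1, 1]]                               # columns are the eigenvectors
P_inv = [[F(1, 3), F(-1, 3)], [F(1, 3), F(2, 3)]]

# A^6 through the diagonal factorization A = P D P^{-1}, with D = diag(1, 4).
D6 = [[1, 0], [0, 4**6]]
A6_diag = matmul(matmul(P, D6), P_inv)

# A^6 by repeated multiplication, as an independent check.
A6_direct = [[1, 0], [0, 1]]
for _ in range(6):
    A6_direct = matmul(A6_direct, A)

assert A6_diag == A6_direct == [[1366, 2730], [1365, 2731]]

# f(A) = P f(D) P^{-1} with f(1) = 2 and f(4) = -1.
fA = matmul(matmul(P, [[2, 0], [0, -1]]), P_inv)
assert fA == [[1, -2], [-1, 0]]
```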
9.11. Each of the following real matrices defines a linear transformation on $\mathbf{R}^2$:
$$\text{(a) } A = \begin{bmatrix} 5 & 6 \\ 3 & -2 \end{bmatrix}, \qquad \text{(b) } B = \begin{bmatrix} 1 & -1 \\ 2 & -1 \end{bmatrix}, \qquad \text{(c) } C = \begin{bmatrix} 5 & -1 \\ 1 & 3 \end{bmatrix}$$
Find, for each matrix, all eigenvalues and a maximum set S of linearly independent eigenvectors. Which of these linear operators are diagonalizable, that is, which can be represented by a diagonal matrix?

(a) First find $\Delta(t) = t^2 - 3t - 28 = (t - 7)(t + 4)$. The roots $\lambda = 7$ and $\lambda = -4$ are the eigenvalues of A. We find corresponding eigenvectors.

(i) Subtract $\lambda = 7$ down the diagonal of A to obtain
$$M = \begin{bmatrix} -2 & 6 \\ 3 & -9 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} -2x + 6y &= 0 \\ 3x - 9y &= 0 \end{aligned} \quad \text{or} \quad x - 3y = 0$$
Here $v_1 = (3, 1)$ is a nonzero solution.

(ii) Subtract $\lambda = -4$ (or add 4) down the diagonal of A to obtain
$$M = \begin{bmatrix} 9 & 6 \\ 3 & 2 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} 9x + 6y &= 0 \\ 3x + 2y &= 0 \end{aligned} \quad \text{or} \quad 3x + 2y = 0$$
Here $v_2 = (2, -3)$ is a nonzero solution.

Then $S = \{v_1, v_2\} = \{(3, 1), (2, -3)\}$ is a maximal set of linearly independent eigenvectors. Because S is a basis of $\mathbf{R}^2$, A is diagonalizable. Using the basis S, A is represented by the diagonal matrix D = diag(7, -4).

(b) First find the characteristic polynomial $\Delta(t) = t^2 + 1$. There are no real roots. Thus B, a real matrix representing a linear transformation on $\mathbf{R}^2$, has no eigenvalues and no eigenvectors. Hence, in particular, B is not diagonalizable.

(c) First find $\Delta(t) = t^2 - 8t + 16 = (t - 4)^2$. Thus, $\lambda = 4$ is the only eigenvalue of C. Subtract $\lambda = 4$ down the diagonal of C to obtain
$$M = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix}, \quad \text{corresponding to} \quad x - y = 0$$
The homogeneous system has only one independent solution; for example, x = 1, y = 1. Thus, v = (1, 1) is an eigenvector of C. Furthermore, as there are no other eigenvalues, the singleton set S = {v} = {(1, 1)} is a maximal set of linearly independent eigenvectors of C. Furthermore, because S is not a basis of $\mathbf{R}^2$, C is not diagonalizable.


9.12. Suppose the matrix B in Problem 9.11 represents a linear operator on complex space $\mathbf{C}^2$. Show that, in this case, B is diagonalizable by finding a basis S of $\mathbf{C}^2$ consisting of eigenvectors of B.

The characteristic polynomial of B is still $\Delta(t) = t^2 + 1$. As a polynomial over $\mathbf{C}$, $\Delta(t)$ does factor; specifically, $\Delta(t) = (t - i)(t + i)$. Thus, $\lambda = i$ and $\lambda = -i$ are the eigenvalues of B.

(i) Subtract $\lambda = i$ down the diagonal of B to obtain the homogeneous system
$$\begin{aligned} (1 - i)x - y &= 0 \\ 2x + (-1 - i)y &= 0 \end{aligned} \quad \text{or} \quad (1 - i)x - y = 0$$
The system has only one independent solution; for example, x = 1, y = 1 - i. Thus, $v_1 = (1, 1 - i)$ is an eigenvector that spans the eigenspace of $\lambda = i$.

(ii) Subtract $\lambda = -i$ (or add i) down the diagonal of B to obtain the homogeneous system
$$\begin{aligned} (1 + i)x - y &= 0 \\ 2x + (-1 + i)y &= 0 \end{aligned} \quad \text{or} \quad (1 + i)x - y = 0$$
The system has only one independent solution; for example, x = 1, y = 1 + i. Thus, $v_2 = (1, 1 + i)$ is an eigenvector that spans the eigenspace of $\lambda = -i$.

As a complex matrix, B is diagonalizable. Specifically, $S = \{v_1, v_2\} = \{(1, 1 - i), (1, 1 + i)\}$ is a basis of $\mathbf{C}^2$ consisting of eigenvectors of B. Using this basis S, B is represented by the diagonal matrix D = diag(i, -i).


9.13. Let L be the linear transformation on $\mathbf{R}^2$ that reflects each point P across the line y = kx, where k > 0. (See Fig. 9-1.)

(a) Show that $v_1 = (1, k)$ and $v_2 = (k, -1)$ are eigenvectors of L.
(b) Show that L is diagonalizable, and find a diagonal representation D.

Figure 9-1

(a) The vector $v_1 = (1, k)$ lies on the line y = kx, and hence is left fixed by L; that is, $L(v_1) = v_1$. Thus, $v_1$ is an eigenvector of L belonging to the eigenvalue $\lambda_1 = 1$.

The vector $v_2 = (k, -1)$ is perpendicular to the line y = kx (because $v_1 \cdot v_2 = k - k = 0$), and hence, L reflects $v_2$ into its negative; that is, $L(v_2) = -v_2$. Thus, $v_2$ is an eigenvector of L belonging to the eigenvalue $\lambda_2 = -1$.

(b) Here $S = \{v_1, v_2\}$ is a basis of $\mathbf{R}^2$ consisting of eigenvectors of L. Thus, L is diagonalizable, with the diagonal representation $D = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ (relative to the basis S).


Eigenvalues and Eigenvectors 


9.14. Let $A = \begin{bmatrix} 4 & 1 & -1 \\ 2 & 5 & -2 \\ 1 & 1 & 2 \end{bmatrix}$.

(a) Find all eigenvalues of A.
(b) Find a maximum set S of linearly independent eigenvectors of A.
(c) Is A diagonalizable? If yes, find P such that $D = P^{-1}AP$ is diagonal.

(a) First find the characteristic polynomial $\Delta(t)$ of A. We have
$$\operatorname{tr}(A) = 4 + 5 + 2 = 11 \qquad \text{and} \qquad |A| = 40 - 2 - 2 + 5 + 8 - 4 = 45$$
Also, find each cofactor $A_{ii}$ of $a_{ii}$ in A:
$$A_{11} = \begin{vmatrix} 5 & -2 \\ 1 & 2 \end{vmatrix} = 12, \qquad A_{22} = \begin{vmatrix} 4 & -1 \\ 1 & 2 \end{vmatrix} = 9, \qquad A_{33} = \begin{vmatrix} 4 & 1 \\ 2 & 5 \end{vmatrix} = 18$$
Hence, $\Delta(t) = t^3 - \operatorname{tr}(A)\,t^2 + (A_{11} + A_{22} + A_{33})\,t - |A| = t^3 - 11t^2 + 39t - 45$

Assuming $\Delta(t)$ has a rational root, it must be among $\pm 1, \pm 3, \pm 5, \pm 9, \pm 15, \pm 45$. Testing, by synthetic division, we get
$$\begin{array}{r|rrrr} 3 & 1 & -11 & 39 & -45 \\ & & 3 & -24 & 45 \\ \hline & 1 & -8 & 15 & 0 \end{array}$$
Thus, t = 3 is a root of $\Delta(t)$. Also, t - 3 is a factor and $t^2 - 8t + 15$ is a factor. Hence,
$$\Delta(t) = (t - 3)(t^2 - 8t + 15) = (t - 3)(t - 5)(t - 3) = (t - 3)^2 (t - 5)$$
Accordingly, $\lambda = 3$ and $\lambda = 5$ are eigenvalues of A.

(b) Find linearly independent eigenvectors for each eigenvalue of A.

(i) Subtract $\lambda = 3$ down the diagonal of A to obtain the matrix
$$M = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 2 & -2 \\ 1 & 1 & -1 \end{bmatrix}, \quad \text{corresponding to} \quad x + y - z = 0$$
Here u = (1, -1, 0) and v = (1, 0, 1) are linearly independent solutions.

(ii) Subtract $\lambda = 5$ down the diagonal of A to obtain the matrix
$$M = \begin{bmatrix} -1 & 1 & -1 \\ 2 & 0 & -2 \\ 1 & 1 & -3 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} -x + y - z &= 0 \\ 2x - 2z &= 0 \\ x + y - 3z &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x - z &= 0 \\ y - 2z &= 0 \end{aligned}$$
Only z is a free variable. Here w = (1, 2, 1) is a solution.

Thus, S = {u, v, w} = {(1, -1, 0), (1, 0, 1), (1, 2, 1)} is a maximal set of linearly independent eigenvectors of A.

Remark: The vectors u and v were chosen so that they were independent solutions of the system x + y - z = 0. On the other hand, w is automatically independent of u and v because w belongs to a different eigenvalue of A. Thus, the three vectors are linearly independent.

(c) A is diagonalizable, because it has three linearly independent eigenvectors. Let P be the matrix with columns u, v, w. Then
$$P = \begin{bmatrix} 1 & 1 & 1 \\ -1 & 0 & 2 \\ 0 & 1 & 1 \end{bmatrix} \qquad \text{and} \qquad D = P^{-1}AP = \begin{bmatrix} 3 & & \\ & 3 & \\ & & 5 \end{bmatrix}$$
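A diagonalization such as the one in Problem 9.14 can be verified without inverting P: the relation $D = P^{-1}AP$ is equivalent to AP = PD together with P being nonsingular. The sketch below checks both conditions in integer arithmetic; the helper names are illustrative.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[4, 1, -1], [2, 5, -2], [1, 1, 2]]
P = [[1, 1, 1], [-1, 0, 2], [0, 1, 1]]      # columns are u, v, w
D = [[3, 0, 0], [0, 3, 0], [0, 0, 5]]

# AP = PD says each column of P is an eigenvector for the matching entry of D.
assert matmul(A, P) == matmul(P, D)
# A nonzero determinant means u, v, w are independent, so P is invertible.
assert det3(P) != 0
```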
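9.15. Repeat Problem 9.14 for the matrix $B = \begin{bmatrix} 3 & -1 & 1 \\ 7 & -5 & 1 \\ 6 & -6 & 2 \end{bmatrix}$.

(a) First find the characteristic polynomial $\Delta(t)$ of B. We have
$$\operatorname{tr}(B) = 0, \quad |B| = -16, \quad B_{11} = -4, \quad B_{22} = 0, \quad B_{33} = -8, \quad \text{so} \quad \sum_i B_{ii} = -12$$
Therefore, $\Delta(t) = t^3 - 12t + 16 = (t - 2)^2 (t + 4)$. Thus, $\lambda_1 = 2$ and $\lambda_2 = -4$ are the eigenvalues of B.

(b) Find a basis for the eigenspace of each eigenvalue of B.

(i) Subtract $\lambda_1 = 2$ down the diagonal of B to obtain
$$M = \begin{bmatrix} 1 & -1 & 1 \\ 7 & -7 & 1 \\ 6 & -6 & 0 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} x - y + z &= 0 \\ 7x - 7y + z &= 0 \\ 6x - 6y &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x - y + z &= 0 \\ z &= 0 \end{aligned}$$
The system has only one independent solution; for example, x = 1, y = 1, z = 0. Thus, u = (1, 1, 0) forms a basis for the eigenspace of $\lambda_1 = 2$.

(ii) Subtract $\lambda_2 = -4$ (or add 4) down the diagonal of B to obtain
$$M = \begin{bmatrix} 7 & -1 & 1 \\ 7 & -1 & 1 \\ 6 & -6 & 6 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} 7x - y + z &= 0 \\ 7x - y + z &= 0 \\ 6x - 6y + 6z &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x - y + z &= 0 \\ 6y - 6z &= 0 \end{aligned}$$
The system has only one independent solution; for example, x = 0, y = 1, z = 1. Thus, v = (0, 1, 1) forms a basis for the eigenspace of $\lambda_2 = -4$.

Thus S = {u, v} is a maximal set of linearly independent eigenvectors of B.

(c) Because B has at most two linearly independent eigenvectors, B is not similar to a diagonal matrix; that is, B is not diagonalizable.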


9.16. Find the algebraic and geometric multiplicities of the eigenvalue $\lambda_1 = 2$ of the matrix B in Problem 9.15.

The algebraic multiplicity of $\lambda_1 = 2$ is 2, because t - 2 appears with exponent 2 in $\Delta(t)$. However, the geometric multiplicity of $\lambda_1 = 2$ is 1, because $\dim E_{\lambda_1} = 1$ (where $E_{\lambda_1}$ is the eigenspace of $\lambda_1$).
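The geometric multiplicity of an eigenvalue $\lambda$ of an n-square matrix equals n minus the rank of the matrix obtained by subtracting $\lambda$ down the diagonal, so it can be computed by row reduction. The sketch below uses exact fractions and a hypothetical helper `geometric_multiplicity`; the matrix B is the one from Problem 9.15, and the matrix A from Problem 9.14 is included for contrast.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in r] for r in M]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def geometric_multiplicity(A, lam):
    """dim of the eigenspace of lam, computed as n - rank(A - lam*I)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    return n - rank(shifted)

B = [[3, -1, 1], [7, -5, 1], [6, -6, 2]]     # matrix of Problem 9.15
A = [[4, 1, -1], [2, 5, -2], [1, 1, 2]]      # matrix of Problem 9.14

assert geometric_multiplicity(B, 2) == 1     # algebraic multiplicity is 2
assert geometric_multiplicity(B, -4) == 1
assert geometric_multiplicity(A, 3) == 2     # here both multiplicities are 2
```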


9.17. Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be defined by T(x, y, z) = (2x + y - 2z, 2x + 3y - 4z, x + y - z). Find all eigenvalues of T, and find a basis of each eigenspace. Is T diagonalizable? If so, find the basis S of $\mathbf{R}^3$ that diagonalizes T, and find its diagonal representation D.

First find the matrix A that represents T relative to the usual basis of $\mathbf{R}^3$ by writing down the coefficients of x, y, z as rows, and then find the characteristic polynomial of A (and T). We have
$$A = [T] = \begin{bmatrix} 2 & 1 & -2 \\ 2 & 3 & -4 \\ 1 & 1 & -1 \end{bmatrix} \quad \text{and} \quad \begin{aligned} &\operatorname{tr}(A) = 4, \quad |A| = 2 \\ &A_{11} = 1, \quad A_{22} = 0, \quad A_{33} = 4 \\ &\textstyle\sum_i A_{ii} = 5 \end{aligned}$$
Therefore, $\Delta(t) = t^3 - 4t^2 + 5t - 2 = (t - 1)^2 (t - 2)$, and so $\lambda = 1$ and $\lambda = 2$ are the eigenvalues of A (and T). We next find linearly independent eigenvectors for each eigenvalue of A.

(i) Subtract $\lambda = 1$ down the diagonal of A to obtain the matrix
$$M = \begin{bmatrix} 1 & 1 & -2 \\ 2 & 2 & -4 \\ 1 & 1 & -2 \end{bmatrix}, \quad \text{corresponding to} \quad x + y - 2z = 0$$
Here y and z are free variables, and so there are two linearly independent eigenvectors belonging to $\lambda = 1$. For example, u = (1, -1, 0) and v = (2, 0, 1) are two such eigenvectors.

(ii) Subtract $\lambda = 2$ down the diagonal of A to obtain
$$M = \begin{bmatrix} 0 & 1 & -2 \\ 2 & 1 & -4 \\ 1 & 1 & -3 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} y - 2z &= 0 \\ 2x + y - 4z &= 0 \\ x + y - 3z &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x + y - 3z &= 0 \\ y - 2z &= 0 \end{aligned}$$
Only z is a free variable. Here w = (1, 2, 1) is a solution.

Thus, T is diagonalizable, because it has three independent eigenvectors. Specifically, choosing
$$S = \{u, v, w\} = \{(1, -1, 0), (2, 0, 1), (1, 2, 1)\}$$
as a basis, T is represented by the diagonal matrix D = diag(1, 1, 2).

9.18. Prove the following for a linear operator (matrix) T:

(a) The scalar 0 is an eigenvalue of T if and only if T is singular.
(b) If $\lambda$ is an eigenvalue of T, where T is invertible, then $\lambda^{-1}$ is an eigenvalue of $T^{-1}$.

(a) We have that 0 is an eigenvalue of T if and only if there is a vector $v \neq 0$ such that T(v) = 0v = 0, that is, if and only if T is singular.

(b) Because T is invertible, it is nonsingular; hence, by (a), $\lambda \neq 0$. By definition of an eigenvalue, there exists $v \neq 0$ such that $T(v) = \lambda v$. Applying $T^{-1}$ to both sides, we obtain
$$v = T^{-1}(\lambda v) = \lambda T^{-1}(v), \quad \text{and so} \quad T^{-1}(v) = \lambda^{-1} v$$
Therefore, $\lambda^{-1}$ is an eigenvalue of $T^{-1}$.
9.19. Let $\lambda$ be an eigenvalue of a linear operator $T: V \to V$, and let $E_\lambda$ consist of all the eigenvectors belonging to $\lambda$ (called the eigenspace of $\lambda$). Prove that $E_\lambda$ is a subspace of V. That is, prove

(a) If $u \in E_\lambda$, then $ku \in E_\lambda$ for any scalar k. (b) If $u, v \in E_\lambda$, then $u + v \in E_\lambda$.

(a) Because $u \in E_\lambda$, we have $T(u) = \lambda u$. Then $T(ku) = kT(u) = k(\lambda u) = \lambda(ku)$, and so $ku \in E_\lambda$. (We view the zero vector $0 \in V$ as an "eigenvector" of $\lambda$ in order for $E_\lambda$ to be a subspace of V.)

(b) As $u, v \in E_\lambda$, we have $T(u) = \lambda u$ and $T(v) = \lambda v$. Then
$$T(u + v) = T(u) + T(v) = \lambda u + \lambda v = \lambda(u + v), \quad \text{and so} \quad u + v \in E_\lambda$$


9.20. Prove Theorem 9.6: The following are equivalent:
(i) The scalar $\lambda$ is an eigenvalue of A.
(ii) The matrix $\lambda I - A$ is singular.
(iii) The scalar $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of A.

The scalar $\lambda$ is an eigenvalue of A if and only if there exists a nonzero vector v such that
$$Av = \lambda v \quad \text{or} \quad (\lambda I)v - Av = 0 \quad \text{or} \quad (\lambda I - A)v = 0$$
or $\lambda I - A$ is singular. In such a case, $\lambda$ is a root of $\Delta(t) = |tI - A|$. Also, v is in the eigenspace $E_\lambda$ of $\lambda$ if and only if the above relations hold. Hence, v is a solution of $(\lambda I - A)X = 0$.


9.21. Prove Theorem 9.8': Suppose $v_1, v_2, \ldots, v_n$ are nonzero eigenvectors of T belonging to distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $v_1, v_2, \ldots, v_n$ are linearly independent.

Suppose the theorem is not true. Let $v_1, v_2, \ldots, v_s$ be a minimal set of vectors for which the theorem is not true. We have s > 1, because $v_1 \neq 0$. Also, by the minimality condition, $v_2, \ldots, v_s$ are linearly independent. Thus, $v_1$ is a linear combination of $v_2, \ldots, v_s$, say,
$$v_1 = a_2 v_2 + a_3 v_3 + \cdots + a_s v_s \qquad (1)$$
(where some $a_k \neq 0$). Applying T to (1) and using the linearity of T yields
$$T(v_1) = T(a_2 v_2 + a_3 v_3 + \cdots + a_s v_s) = a_2 T(v_2) + a_3 T(v_3) + \cdots + a_s T(v_s) \qquad (2)$$
Because $v_j$ is an eigenvector of T belonging to $\lambda_j$, we have $T(v_j) = \lambda_j v_j$. Substituting in (2) yields
$$\lambda_1 v_1 = a_2 \lambda_2 v_2 + a_3 \lambda_3 v_3 + \cdots + a_s \lambda_s v_s \qquad (3)$$
Multiplying (1) by $\lambda_1$ yields
$$\lambda_1 v_1 = a_2 \lambda_1 v_2 + a_3 \lambda_1 v_3 + \cdots + a_s \lambda_1 v_s \qquad (4)$$
Setting the right-hand sides of (3) and (4) equal to each other, or subtracting (3) from (4), yields
$$a_2(\lambda_1 - \lambda_2) v_2 + a_3(\lambda_1 - \lambda_3) v_3 + \cdots + a_s(\lambda_1 - \lambda_s) v_s = 0 \qquad (5)$$
Because $v_2, v_3, \ldots, v_s$ are linearly independent, the coefficients in (5) must all be zero. That is,
$$a_2(\lambda_1 - \lambda_2) = 0, \quad a_3(\lambda_1 - \lambda_3) = 0, \quad \ldots, \quad a_s(\lambda_1 - \lambda_s) = 0$$
However, the $\lambda_i$ are distinct. Hence $\lambda_1 - \lambda_j \neq 0$ for j > 1. Hence, $a_2 = 0, a_3 = 0, \ldots, a_s = 0$. This contradicts the fact that some $a_k \neq 0$. The theorem is proved.


9.22. Prove Theorem 9.9. Suppose $\Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n)$ is the characteristic polynomial of an n-square matrix A, and suppose the n roots $a_i$ are distinct. Then A is similar to the diagonal matrix $D = \operatorname{diag}(a_1, a_2, \ldots, a_n)$.

Let $v_1, v_2, \ldots, v_n$ be (nonzero) eigenvectors corresponding to the eigenvalues $a_i$. Then the n eigenvectors $v_i$ are linearly independent (Theorem 9.8), and hence form a basis of $K^n$. Accordingly, A is diagonalizable (i.e., A is similar to a diagonal matrix D), and the diagonal elements of D are the eigenvalues $a_i$.


9.23. Prove Theorem 9.10': The geometric multiplicity of an eigenvalue $\lambda$ of T does not exceed its algebraic multiplicity.

Suppose the geometric multiplicity of $\lambda$ is r. Then its eigenspace $E_\lambda$ contains r linearly independent eigenvectors $v_1, \ldots, v_r$. Extend the set $\{v_i\}$ to a basis of V, say, $\{v_1, \ldots, v_r, w_1, \ldots, w_s\}$. We have
$$\begin{aligned} T(v_1) &= \lambda v_1, \quad T(v_2) = \lambda v_2, \quad \ldots, \quad T(v_r) = \lambda v_r, \\ T(w_1) &= a_{11} v_1 + \cdots + a_{1r} v_r + b_{11} w_1 + \cdots + b_{1s} w_s \\ T(w_2) &= a_{21} v_1 + \cdots + a_{2r} v_r + b_{21} w_1 + \cdots + b_{2s} w_s \\ &\;\;\vdots \\ T(w_s) &= a_{s1} v_1 + \cdots + a_{sr} v_r + b_{s1} w_1 + \cdots + b_{ss} w_s \end{aligned}$$
Then $M = \begin{bmatrix} \lambda I_r & A \\ 0 & B \end{bmatrix}$ is the matrix of T in the above basis, where $A = [a_{ij}]^T$ and $B = [b_{ij}]^T$.

Because M is block triangular, the characteristic polynomial $(t - \lambda)^r$ of the block $\lambda I_r$ must divide the characteristic polynomial of M and hence of T. Thus, the algebraic multiplicity of $\lambda$ for T is at least r, as required.


Diagonalizing Real Symmetric Matrices and Quadratic Forms 


9.24. Let $A = \begin{bmatrix} 7 & 3 \\ 3 & -1 \end{bmatrix}$. Find an orthogonal matrix P such that $D = P^{-1}AP$ is diagonal.

First find the characteristic polynomial $\Delta(t)$ of A. We have
$$\Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 - 6t - 16 = (t - 8)(t + 2)$$
Thus, the eigenvalues of A are $\lambda = 8$ and $\lambda = -2$. We next find corresponding eigenvectors.

Subtract $\lambda = 8$ down the diagonal of A to obtain the matrix
$$M = \begin{bmatrix} -1 & 3 \\ 3 & -9 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} -x + 3y &= 0 \\ 3x - 9y &= 0 \end{aligned} \quad \text{or} \quad x - 3y = 0$$
A nonzero solution is $u_1 = (3, 1)$.

Subtract $\lambda = -2$ (or add 2) down the diagonal of A to obtain the matrix
$$M = \begin{bmatrix} 9 & 3 \\ 3 & 1 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} 9x + 3y &= 0 \\ 3x + y &= 0 \end{aligned} \quad \text{or} \quad 3x + y = 0$$
A nonzero solution is $u_2 = (1, -3)$.

As expected, because A is symmetric, the eigenvectors $u_1$ and $u_2$ are orthogonal. Normalize $u_1$ and $u_2$ to obtain, respectively, the unit vectors
$$\hat{u}_1 = (3/\sqrt{10}, \; 1/\sqrt{10}) \qquad \text{and} \qquad \hat{u}_2 = (1/\sqrt{10}, \; -3/\sqrt{10})$$
Finally, let P be the matrix whose columns are the unit vectors $\hat{u}_1$ and $\hat{u}_2$, respectively. Then
$$P = \begin{bmatrix} 3/\sqrt{10} & 1/\sqrt{10} \\ 1/\sqrt{10} & -3/\sqrt{10} \end{bmatrix} \qquad \text{and} \qquad D = P^{-1}AP = \begin{bmatrix} 8 & 0 \\ 0 & -2 \end{bmatrix}$$
As expected, the diagonal entries in D are the eigenvalues of A.


9.25. Let $B = \begin{bmatrix} 11 & -8 & 4 \\ -8 & -1 & -2 \\ 4 & -2 & -4 \end{bmatrix}$.

(a) Find all eigenvalues of B.
(b) Find a maximal set S of nonzero orthogonal eigenvectors of B.
(c) Find an orthogonal matrix P such that $D = P^{-1}BP$ is diagonal.

(a) First find the characteristic polynomial of B. We have
$$\operatorname{tr}(B) = 6, \quad |B| = 400, \quad B_{11} = 0, \quad B_{22} = -60, \quad B_{33} = -75, \quad \text{so} \quad \sum_i B_{ii} = -135$$
Hence, $\Delta(t) = t^3 - 6t^2 - 135t - 400$. If $\Delta(t)$ has an integer root it must divide 400. Testing t = -5, by synthetic division, yields
$$\begin{array}{r|rrrr} -5 & 1 & -6 & -135 & -400 \\ & & -5 & 55 & 400 \\ \hline & 1 & -11 & -80 & 0 \end{array}$$
Thus, t + 5 is a factor of $\Delta(t)$, and $t^2 - 11t - 80$ is a factor. Thus,
$$\Delta(t) = (t + 5)(t^2 - 11t - 80) = (t + 5)^2 (t - 16)$$
The eigenvalues of B are $\lambda = -5$ (multiplicity 2), and $\lambda = 16$ (multiplicity 1).

(b) Find an orthogonal basis for each eigenspace. Subtract $\lambda = -5$ (or, add 5) down the diagonal of B to obtain the homogeneous system
$$16x - 8y + 4z = 0, \qquad -8x + 4y - 2z = 0, \qquad 4x - 2y + z = 0$$
That is, 4x - 2y + z = 0. The system has two independent solutions. One solution is $v_1 = (0, 1, 2)$. We seek a second solution $v_2 = (a, b, c)$, which is orthogonal to $v_1$, such that
$$4a - 2b + c = 0, \qquad \text{and also} \qquad b + 2c = 0$$
One such solution is $v_2 = (-5, -8, 4)$.

Subtract $\lambda = 16$ down the diagonal of B to obtain the homogeneous system
$$-5x - 8y + 4z = 0, \qquad -8x - 17y - 2z = 0, \qquad 4x - 2y - 20z = 0$$
This system yields a nonzero solution $v_3 = (4, -2, 1)$. (As expected from Theorem 9.13, the eigenvector $v_3$ is orthogonal to $v_1$ and $v_2$.)

Then $v_1, v_2, v_3$ form a maximal set of nonzero orthogonal eigenvectors of B.

(c) Normalize $v_1, v_2, v_3$ to obtain the orthonormal basis:
$$\hat{v}_1 = v_1/\sqrt{5}, \qquad \hat{v}_2 = v_2/\sqrt{105}, \qquad \hat{v}_3 = v_3/\sqrt{21}$$
Then P is the matrix whose columns are $\hat{v}_1, \hat{v}_2, \hat{v}_3$. Thus,
$$P = \begin{bmatrix} 0 & -5/\sqrt{105} & 4/\sqrt{21} \\ 1/\sqrt{5} & -8/\sqrt{105} & -2/\sqrt{21} \\ 2/\sqrt{5} & 4/\sqrt{105} & 1/\sqrt{21} \end{bmatrix} \qquad \text{and} \qquad D = P^{-1}BP = \begin{bmatrix} -5 & & \\ & -5 & \\ & & 16 \end{bmatrix}$$
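Before normalizing, the eigenvectors found in Problem 9.25 can be checked in exact integer arithmetic: each should satisfy $Bv = \lambda v$, and the three should be mutually orthogonal. The sketch below uses illustrative helper names (`apply`, `dot`).

```python
def apply(M, v):
    """Matrix-vector product, with M a list of rows and v a tuple."""
    return tuple(sum(m * x for m, x in zip(row, v)) for row in M)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

B = [[11, -8, 4], [-8, -1, -2], [4, -2, -4]]
v1, v2, v3 = (0, 1, 2), (-5, -8, 4), (4, -2, 1)

# Eigenvector equations B v = lambda v for lambda = -5, -5, 16.
assert apply(B, v1) == tuple(-5 * x for x in v1)
assert apply(B, v2) == tuple(-5 * x for x in v2)
assert apply(B, v3) == tuple(16 * x for x in v3)

# Pairwise orthogonality, which is what makes the normalized P orthogonal.
assert dot(v1, v2) == 0 and dot(v1, v3) == 0 and dot(v2, v3) == 0

# Squared lengths give the normalizing factors sqrt(5), sqrt(105), sqrt(21).
assert (dot(v1, v1), dot(v2, v2), dot(v3, v3)) == (5, 105, 21)
```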


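9.26. Let $q(x,y) = x^2 + 6xy - 7y^2$. Find an orthogonal substitution that diagonalizes $q$.

Find the symmetric matrix $A$ that represents $q$ and its characteristic polynomial $\Delta(t)$. We have

\[ A = \begin{bmatrix} 1 & 3 \\ 3 & -7 \end{bmatrix} \qquad \text{and} \qquad \Delta(t) = t^2 + 6t - 16 = (t - 2)(t + 8) \]

The eigenvalues of $A$ are $\lambda = 2$ and $\lambda = -8$. Thus, using $s$ and $t$ as new variables, a diagonal form of $q$ is

\[ q(s, t) = 2s^2 - 8t^2 \]

The corresponding orthogonal substitution is obtained by finding an orthogonal set of eigenvectors of $A$.

(i) Subtract $\lambda = 2$ down the diagonal of $A$ to obtain the matrix

\[ M = \begin{bmatrix} -1 & 3 \\ 3 & -9 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} -x + 3y &= 0 \\ 3x - 9y &= 0 \end{aligned} \qquad \text{or} \qquad -x + 3y = 0 \]

A nonzero solution is $u_1 = (3, 1)$.

(ii) Subtract $\lambda = -8$ (or add 8) down the diagonal of $A$ to obtain the matrix

\[ M = \begin{bmatrix} 9 & 3 \\ 3 & 1 \end{bmatrix}, \qquad \text{corresponding to} \qquad \begin{aligned} 9x + 3y &= 0 \\ 3x + y &= 0 \end{aligned} \qquad \text{or} \qquad 3x + y = 0 \]

A nonzero solution is $u_2 = (-1, 3)$.

As expected, because $A$ is symmetric, the eigenvectors $u_1$ and $u_2$ are orthogonal.

Now normalize $u_1$ and $u_2$ to obtain, respectively, the unit vectors

\[ \hat{u}_1 = (3/\sqrt{10},\ 1/\sqrt{10}) \qquad \text{and} \qquad \hat{u}_2 = (-1/\sqrt{10},\ 3/\sqrt{10}) \]

Finally, let $P$ be the matrix whose columns are the unit vectors $\hat{u}_1$ and $\hat{u}_2$, respectively. Then $[x, y]^T = P[s, t]^T$ is the required orthogonal change of coordinates. That is,

\[ P = \begin{bmatrix} 3/\sqrt{10} & -1/\sqrt{10} \\ 1/\sqrt{10} & 3/\sqrt{10} \end{bmatrix} \qquad \text{and} \qquad x = \frac{3s - t}{\sqrt{10}}, \quad y = \frac{s + 3t}{\sqrt{10}} \]

One can also express $s$ and $t$ in terms of $x$ and $y$ by using $P^{-1} = P^T$. That is,

\[ s = \frac{3x + y}{\sqrt{10}}, \qquad t = \frac{-x + 3y}{\sqrt{10}} \]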
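
As a sanity check on the substitution in Problem 9.26, the sketch below evaluates $q$ at the scaled point $(3s - t,\ s + 3t)$. Because $q$ is homogeneous of degree 2, scaling both coordinates by $\sqrt{10}$ multiplies $q$ by 10, so the whole check stays in integer arithmetic. The function names are illustrative only.

```python
def q(x, y):
    """The quadratic form q(x, y) = x^2 + 6xy - 7y^2."""
    return x * x + 6 * x * y - 7 * y * y

def q_scaled(s, t):
    """q evaluated at sqrt(10) * (x, y), where x = (3s - t)/sqrt(10), y = (s + 3t)/sqrt(10)."""
    return q(3 * s - t, s + 3 * t)

# If the substitution diagonalizes q to 2s^2 - 8t^2, the scaled value must
# equal 10 * (2s^2 - 8t^2) with no cross term in s*t.
samples = [(1, 0), (0, 1), (2, -3), (5, 7), (-4, 9)]
diagonal_ok = all(q_scaled(s, t) == 10 * (2 * s * s - 8 * t * t) for s, t in samples)
print(diagonal_ok)
```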
Minimal Polynomial


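9.27. Let $A = \begin{bmatrix} 4 & -2 & 2 \\ 6 & -3 & 4 \\ 3 & -2 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & -2 & 2 \\ 4 & -4 & 6 \\ 2 & -3 & 5 \end{bmatrix}$. The characteristic polynomial of both matrices is $\Delta(t) = (t - 2)(t - 1)^2$. Find the minimal polynomial $m(t)$ of each matrix.

The minimal polynomial $m(t)$ must divide $\Delta(t)$. Also, each factor of $\Delta(t)$ (i.e., $t - 2$ and $t - 1$) must also be a factor of $m(t)$. Thus, $m(t)$ must be exactly one of the following:

\[ f(t) = (t - 2)(t - 1) \qquad \text{or} \qquad g(t) = (t - 2)(t - 1)^2 \]

(a) By the Cayley-Hamilton theorem, $g(A) = \Delta(A) = 0$, so we need only test $f(t)$. We have

\[ f(A) = (A - 2I)(A - I) = \begin{bmatrix} 2 & -2 & 2 \\ 6 & -5 & 4 \\ 3 & -2 & 1 \end{bmatrix} \begin{bmatrix} 3 & -2 & 2 \\ 6 & -4 & 4 \\ 3 & -2 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

Thus, $m(t) = f(t) = (t - 2)(t - 1) = t^2 - 3t + 2$ is the minimal polynomial of $A$.

(b) Again $g(B) = \Delta(B) = 0$, so we need only test $f(t)$. We get

\[ f(B) = (B - 2I)(B - I) = \begin{bmatrix} 1 & -2 & 2 \\ 4 & -6 & 6 \\ 2 & -3 & 3 \end{bmatrix} \begin{bmatrix} 2 & -2 & 2 \\ 4 & -5 & 6 \\ 2 & -3 & 4 \end{bmatrix} = \begin{bmatrix} -2 & 2 & -2 \\ -4 & 4 & -4 \\ -2 & 2 & -2 \end{bmatrix} \neq 0 \]

Thus, $m(t) \neq f(t)$. Accordingly, $m(t) = g(t) = (t - 2)(t - 1)^2$ is the minimal polynomial of $B$. [We emphasize that we do not need to compute $g(B)$; we know $g(B) = 0$ from the Cayley-Hamilton theorem.]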
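
The test used in Problem 9.27 (evaluate each candidate divisor of $\Delta(t)$ at the matrix) is easy to mechanize. The sketch below is one possible check in plain Python; the helper names `matmul`, `shift`, `f`, and `g` are illustrative and not part of the text.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shift(M, c):
    """Return M - c*I."""
    return [[M[i][j] - (c if i == j else 0) for j in range(len(M))]
            for i in range(len(M))]

def f(M):
    """f(M) = (M - 2I)(M - I)."""
    return matmul(shift(M, 2), shift(M, 1))

def g(M):
    """g(M) = (M - 2I)(M - I)^2."""
    return matmul(f(M), shift(M, 1))

A = [[4, -2, 2], [6, -3, 4], [3, -2, 3]]
B = [[3, -2, 2], [4, -4, 6], [2, -3, 5]]
zero = [[0] * 3 for _ in range(3)]

f_A, f_B, g_B = f(A), f(B), g(B)
print(f_A == zero)   # f annihilates A, so m(t) = f(t) for A
print(f_B == zero)   # f does not annihilate B
print(g_B == zero)   # g annihilates B, as Cayley-Hamilton predicts
```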


9.28. Find the minimal polynomial $m(t)$ of each of the following matrices:

\[ \text{(a)}\ A = \begin{bmatrix} 5 & 1 \\ 3 & 7 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 3 \\ 0 & 0 & 3 \end{bmatrix}, \qquad \text{(c)}\ C = \begin{bmatrix} 4 & -1 \\ 1 & 2 \end{bmatrix} \]

(a) The characteristic polynomial of $A$ is $\Delta(t) = t^2 - 12t + 32 = (t - 4)(t - 8)$. Because $\Delta(t)$ has distinct factors, the minimal polynomial $m(t) = \Delta(t) = t^2 - 12t + 32$.

(b) Because $B$ is triangular, its eigenvalues are the diagonal elements 1, 2, 3; and so its characteristic polynomial is $\Delta(t) = (t - 1)(t - 2)(t - 3)$. Because $\Delta(t)$ has distinct factors, $m(t) = \Delta(t)$.

(c) The characteristic polynomial of $C$ is $\Delta(t) = t^2 - 6t + 9 = (t - 3)^2$. Hence the minimal polynomial of $C$ is $f(t) = t - 3$ or $g(t) = (t - 3)^2$. However, $f(C) \neq 0$; that is, $C - 3I \neq 0$. Hence,

\[ m(t) = g(t) = \Delta(t) = (t - 3)^2 \]


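9.29. Suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of $V$, and suppose $F$ and $G$ are linear operators on $V$ such that $[F]$ has 0's on and below the diagonal, and $[G]$ has $a \neq 0$ on the superdiagonal and 0's elsewhere. That is,

\[ [F] = \begin{bmatrix} 0 & a_{21} & a_{31} & \cdots & a_{n1} \\ 0 & 0 & a_{32} & \cdots & a_{n2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a_{n,n-1} \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}, \qquad [G] = \begin{bmatrix} 0 & a & 0 & \cdots & 0 \\ 0 & 0 & a & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix} \]

Show that (a) $F^n = 0$, (b) $G^{n-1} \neq 0$, but $G^n = 0$. (These conditions also hold for $[F]$ and $[G]$.)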


(a) We have $F(u_1) = 0$ and, for $r > 1$, $F(u_r)$ is a linear combination of vectors preceding $u_r$ in $S$. That is,

\[ F(u_r) = a_{r1}u_1 + a_{r2}u_2 + \cdots + a_{r,r-1}u_{r-1} \]

Hence, $F^2(u_r) = F(F(u_r))$ is a linear combination of vectors preceding $u_{r-1}$, and so on. Hence, $F^r(u_r) = 0$ for each $r$. Thus, for each $r$, $F^n(u_r) = F^{n-r}(0) = 0$, and so $F^n = 0$, as claimed.

(b) We have $G(u_1) = 0$ and, for each $k > 1$, $G(u_k) = au_{k-1}$. Hence, $G^r(u_k) = a^r u_{k-r}$ for $r < k$. Because $a \neq 0$, $a^{n-1} \neq 0$. Therefore, $G^{n-1}(u_n) = a^{n-1}u_1 \neq 0$, and so $G^{n-1} \neq 0$. On the other hand, by (a), $G^n = 0$.


9.30. Let $B$ be the matrix in Example 9.12(a) that has $\lambda$'s on the diagonal, $a$'s on the superdiagonal, where $a \neq 0$, and 0's elsewhere. Show that $f(t) = (t - \lambda)^n$ is both the characteristic polynomial $\Delta(t)$ and the minimum polynomial $m(t)$ of $B$.

Because $B$ is triangular with $\lambda$'s on the diagonal, $\Delta(t) = f(t) = (t - \lambda)^n$ is its characteristic polynomial. Thus, $m(t)$ is a power of $t - \lambda$. By Problem 9.29, $(B - \lambda I)^{n-1} \neq 0$. Hence, $m(t) = \Delta(t) = (t - \lambda)^n$.


9.31. Find the characteristic polynomial $\Delta(t)$ and minimal polynomial $m(t)$ of each matrix:

\[ \text{(a)}\ M = \begin{bmatrix} 4 & 1 & 0 & 0 & 0 \\ 0 & 4 & 1 & 0 & 0 \\ 0 & 0 & 4 & 0 & 0 \\ 0 & 0 & 0 & 4 & 1 \\ 0 & 0 & 0 & 0 & 4 \end{bmatrix}, \qquad \text{(b)}\ M' = \begin{bmatrix} 2 & 7 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & -2 & 4 \end{bmatrix} \]

(a) $M$ is block diagonal with diagonal blocks

\[ A = \begin{bmatrix} 4 & 1 & 0 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{bmatrix} \qquad \text{and} \qquad B = \begin{bmatrix} 4 & 1 \\ 0 & 4 \end{bmatrix} \]

The characteristic and minimal polynomial of $A$ is $f(t) = (t - 4)^3$, and the characteristic and minimal polynomial of $B$ is $g(t) = (t - 4)^2$. Then

\[ \Delta(t) = f(t)g(t) = (t - 4)^5 \qquad \text{but} \qquad m(t) = \mathrm{LCM}[f(t), g(t)] = (t - 4)^3 \]

(where LCM means least common multiple). We emphasize that the exponent in $m(t)$ is the size of the largest block.

(b) Here $M'$ is block diagonal with diagonal blocks $A' = \begin{bmatrix} 2 & 7 \\ 0 & 2 \end{bmatrix}$ and $B' = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}$. The characteristic and minimal polynomial of $A'$ is $f(t) = (t - 2)^2$. The characteristic polynomial of $B'$ is $g(t) = t^2 - 5t + 6 = (t - 2)(t - 3)$, which has distinct factors. Hence, $g(t)$ is also the minimal polynomial of $B'$. Accordingly,

\[ \Delta(t) = f(t)g(t) = (t - 2)^3(t - 3) \qquad \text{but} \qquad m(t) = \mathrm{LCM}[f(t), g(t)] = (t - 2)^2(t - 3) \]


9.32. Find a matrix $A$ whose minimal polynomial is $f(t) = t^3 - 8t^2 + 5t + 7$.

Simply let $A = \begin{bmatrix} 0 & 0 & -7 \\ 1 & 0 & -5 \\ 0 & 1 & 8 \end{bmatrix}$, the companion matrix of $f(t)$ [defined in Example 9.12(b)].


9.33. Prove Theorem 9.15: The minimal polynomial $m(t)$ of a matrix (linear operator) $A$ divides every polynomial that has $A$ as a zero. In particular (by the Cayley-Hamilton theorem), $m(t)$ divides the characteristic polynomial $\Delta(t)$ of $A$.

Suppose $f(t)$ is a polynomial for which $f(A) = 0$. By the division algorithm, there exist polynomials $q(t)$ and $r(t)$ for which $f(t) = m(t)q(t) + r(t)$ and $r(t) = 0$ or $\deg r(t) < \deg m(t)$. Substituting $t = A$ in this equation, and using that $f(A) = 0$ and $m(A) = 0$, we obtain $r(A) = 0$. If $r(t) \neq 0$, then $r(t)$ is a polynomial of degree less than $m(t)$ that has $A$ as a zero. This contradicts the definition of the minimal polynomial. Thus, $r(t) = 0$, and so $f(t) = m(t)q(t)$; that is, $m(t)$ divides $f(t)$.


9.34. Let $m(t)$ be the minimal polynomial of an $n$-square matrix $A$. Prove that the characteristic polynomial $\Delta(t)$ of $A$ divides $[m(t)]^n$.

Suppose $m(t) = t^r + c_1t^{r-1} + \cdots + c_{r-1}t + c_r$. Define matrices $B_j$ as follows:

\[ \begin{aligned}
B_0 &= I & &\text{so} & I &= B_0 \\
B_1 &= A + c_1I & &\text{so} & c_1I &= B_1 - A = B_1 - AB_0 \\
B_2 &= A^2 + c_1A + c_2I & &\text{so} & c_2I &= B_2 - A(A + c_1I) = B_2 - AB_1 \\
&\ \ \vdots \\
B_{r-1} &= A^{r-1} + c_1A^{r-2} + \cdots + c_{r-1}I & &\text{so} & c_{r-1}I &= B_{r-1} - AB_{r-2}
\end{aligned} \]

Then

\[ -AB_{r-1} = c_rI - (A^r + c_1A^{r-1} + \cdots + c_{r-1}A + c_rI) = c_rI - m(A) = c_rI \]

Set $B(t) = t^{r-1}B_0 + t^{r-2}B_1 + \cdots + tB_{r-2} + B_{r-1}$. Then

\[ \begin{aligned}
(tI - A)B(t) &= (t^rB_0 + t^{r-1}B_1 + \cdots + tB_{r-1}) - (t^{r-1}AB_0 + t^{r-2}AB_1 + \cdots + AB_{r-1}) \\
&= t^rB_0 + t^{r-1}(B_1 - AB_0) + t^{r-2}(B_2 - AB_1) + \cdots + t(B_{r-1} - AB_{r-2}) - AB_{r-1} \\
&= t^rI + c_1t^{r-1}I + c_2t^{r-2}I + \cdots + c_{r-1}tI + c_rI = m(t)I
\end{aligned} \]

Taking the determinant of both sides gives $|tI - A|\,|B(t)| = |m(t)I| = [m(t)]^n$. Because $|B(t)|$ is a polynomial, $|tI - A|$ divides $[m(t)]^n$; that is, the characteristic polynomial of $A$ divides $[m(t)]^n$.


9.35. Prove Theorem 9.16: The characteristic polynomial $\Delta(t)$ and the minimal polynomial $m(t)$ of $A$ have the same irreducible factors.

Suppose $f(t)$ is an irreducible polynomial. If $f(t)$ divides $m(t)$, then $f(t)$ also divides $\Delta(t)$ [because $m(t)$ divides $\Delta(t)$]. On the other hand, if $f(t)$ divides $\Delta(t)$, then by Problem 9.34, $f(t)$ also divides $[m(t)]^n$. But $f(t)$ is irreducible; hence, $f(t)$ also divides $m(t)$. Thus, $m(t)$ and $\Delta(t)$ have the same irreducible factors.


9.36. Prove Theorem 9.19: The minimal polynomial $m(t)$ of a block diagonal matrix $M$ with diagonal blocks $A_i$ is equal to the least common multiple (LCM) of the minimal polynomials of the diagonal blocks $A_i$.

We prove the theorem for the case $r = 2$. The general theorem follows easily by induction. Suppose $M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$, where $A$ and $B$ are square matrices. We need to show that the minimal polynomial $m(t)$ of $M$ is the LCM of the minimal polynomials $g(t)$ and $h(t)$ of $A$ and $B$, respectively.

Because $m(t)$ is the minimal polynomial of $M$, $m(M) = \begin{bmatrix} m(A) & 0 \\ 0 & m(B) \end{bmatrix} = 0$, and so $m(A) = 0$ and $m(B) = 0$. Because $g(t)$ is the minimal polynomial of $A$, $g(t)$ divides $m(t)$. Similarly, $h(t)$ divides $m(t)$. Thus $m(t)$ is a multiple of $g(t)$ and $h(t)$.

Now let $f(t)$ be another multiple of $g(t)$ and $h(t)$. Then $f(M) = \begin{bmatrix} f(A) & 0 \\ 0 & f(B) \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = 0$. But $m(t)$ is the minimal polynomial of $M$; hence, $m(t)$ divides $f(t)$. Thus, $m(t)$ is the LCM of $g(t)$ and $h(t)$.


9.37. Suppose $m(t) = t^r + a_{r-1}t^{r-1} + \cdots + a_1t + a_0$ is the minimal polynomial of an $n$-square matrix $A$. Prove the following:

(a) $A$ is nonsingular if and only if the constant term $a_0 \neq 0$.

(b) If $A$ is nonsingular, then $A^{-1}$ is a polynomial in $A$ of degree $r - 1 < n$.

(a) The following are equivalent: (i) $A$ is nonsingular, (ii) 0 is not a root of $m(t)$, (iii) $a_0 \neq 0$. Thus, the statement is true.

(b) Because $A$ is nonsingular, $a_0 \neq 0$ by (a). We have

\[ m(A) = A^r + a_{r-1}A^{r-1} + \cdots + a_1A + a_0I = 0 \]

Thus,

\[ -\frac{1}{a_0}\left(A^{r-1} + a_{r-1}A^{r-2} + \cdots + a_1I\right)A = I \]

Accordingly,

\[ A^{-1} = -\frac{1}{a_0}\left(A^{r-1} + a_{r-1}A^{r-2} + \cdots + a_1I\right) \]
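
The formula in Problem 9.37(b) can be tried on a small case. The sketch below is an illustration, not part of the original solution: it assumes the matrix $A$ with rows $(5, 1)$ and $(3, 7)$, whose minimal polynomial is $m(t) = t^2 - 12t + 32$ as in Problem 9.28(a), so that $r = 2$, $a_1 = -12$, $a_0 = 32$, and the formula reduces to $A^{-1} = -\tfrac{1}{32}(A - 12I)$. Exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[5, 1], [3, 7]]
a1, a0 = -12, 32                  # m(t) = t^2 + a1*t + a0

# A^{-1} = -(1/a0) * (A + a1*I), the r = 2 case of the formula above.
A_inv = [[Fraction(-(A[i][j] + (a1 if i == j else 0)), a0) for j in range(2)]
         for i in range(2)]

product = matmul(A, A_inv)
print(A_inv)
print(product)
```

Here the inverse is a polynomial of degree $r - 1 = 1$ in $A$, in agreement with part (b).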


SUPPLEMENTARY PROBLEMS 


Polynomials of Matrices 


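9.38. Let $A = \begin{bmatrix} 2 & -3 \\ 5 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}$. Find $f(A)$, $g(A)$, $f(B)$, $g(B)$, where $f(t) = 2t^2 - 5t + 6$ and $g(t) = t^3 - 2t^2 + t + 3$.

9.39. Let $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$. Find $A^2$, $A^3$, $A^n$, where $n > 3$, and $A^{-1}$.

9.40. Let $B = \begin{bmatrix} 8 & 12 & 0 \\ 0 & 8 & 12 \\ 0 & 0 & 8 \end{bmatrix}$. Find a real matrix $A$ such that $B = A^3$.

9.41. For each matrix, find a polynomial having the following matrix as a root:

\[ \text{(a)}\ A = \begin{bmatrix} 2 & 5 \\ 1 & -3 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 2 & -3 \\ 7 & -4 \end{bmatrix}, \qquad \text{(c)}\ C = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 3 \\ 2 & 1 & 4 \end{bmatrix} \]

9.42. Let $A$ be any square matrix and let $f(t)$ be any polynomial. Prove (a) $(P^{-1}AP)^n = P^{-1}A^nP$. (b) $f(P^{-1}AP) = P^{-1}f(A)P$. (c) $f(A^T) = [f(A)]^T$. (d) If $A$ is symmetric, then $f(A)$ is symmetric.

9.43. Let $M = \mathrm{diag}[A_1, \ldots, A_r]$ be a block diagonal matrix, and let $f(t)$ be any polynomial. Show that $f(M)$ is block diagonal and $f(M) = \mathrm{diag}[f(A_1), \ldots, f(A_r)]$.

9.44. Let $M$ be a block triangular matrix with diagonal blocks $A_1, \ldots, A_r$, and let $f(t)$ be any polynomial. Show that $f(M)$ is also a block triangular matrix, with diagonal blocks $f(A_1), \ldots, f(A_r)$.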


Eigenvalues and Eigenvectors 


9.45. For each of the following matrices, find all eigenvalues and corresponding linearly independent eigenvectors:

\[ \text{(a)}\ A = \begin{bmatrix} 2 & -3 \\ 2 & -5 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 2 & 4 \\ -1 & 6 \end{bmatrix}, \qquad \text{(c)}\ C = \begin{bmatrix} 1 & -4 \\ 3 & -7 \end{bmatrix} \]

When possible, find the nonsingular matrix $P$ that diagonalizes the matrix.

9.46. Let $A = \begin{bmatrix} 2 & -1 \\ -2 & 3 \end{bmatrix}$.

(a) Find eigenvalues and corresponding eigenvectors.

(b) Find a nonsingular matrix $P$ such that $D = P^{-1}AP$ is diagonal.

(c) Find $A^8$ and $f(A)$ where $f(t) = t^4 - 5t^3 + 7t^2 - 2t + 5$.

(d) Find a matrix $B$ such that $B^2 = A$.

9.47. Repeat Problem 9.46 for $A = \begin{bmatrix} 5 & 6 \\ -2 & -2 \end{bmatrix}$.

9.48. For each of the following matrices, find all eigenvalues and a maximum set $S$ of linearly independent eigenvectors:

\[ \text{(a)}\ A = \begin{bmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 3 & -1 & 1 \\ 7 & -5 & 1 \\ 6 & -6 & 2 \end{bmatrix}, \qquad \text{(c)}\ C = \begin{bmatrix} 1 & 2 & 2 \\ 1 & 2 & -1 \\ -1 & 1 & 4 \end{bmatrix} \]

Which matrices can be diagonalized, and why?

9.49. For each of the following linear operators $T: \mathbf{R}^2 \to \mathbf{R}^2$, find all eigenvalues and a basis for each eigenspace:
(a) $T(x, y) = (3x + 3y,\ x + 5y)$, (b) $T(x, y) = (3x - 13y,\ x - 3y)$.

9.50. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ be a real matrix. Find necessary and sufficient conditions on $a, b, c, d$ so that $A$ is diagonalizable; that is, so that $A$ has two (real) linearly independent eigenvectors.

9.51. Show that matrices $A$ and $A^T$ have the same eigenvalues. Give an example of a $2 \times 2$ matrix $A$ where $A$ and $A^T$ have different eigenvectors.

9.52. Suppose $v$ is an eigenvector of linear operators $F$ and $G$. Show that $v$ is also an eigenvector of the linear operator $kF + k'G$, where $k$ and $k'$ are scalars.

9.53. Suppose $v$ is an eigenvector of a linear operator $T$ belonging to the eigenvalue $\lambda$. Prove

(a) For $n > 0$, $v$ is an eigenvector of $T^n$ belonging to $\lambda^n$.
(b) $f(\lambda)$ is an eigenvalue of $f(T)$ for any polynomial $f(t)$.

9.54. Suppose $\lambda \neq 0$ is an eigenvalue of the composition $F \circ G$ of linear operators $F$ and $G$. Show that $\lambda$ is also an eigenvalue of the composition $G \circ F$. [Hint: Show that $G(v)$ is an eigenvector of $G \circ F$.]

9.55. Let $E: V \to V$ be a projection mapping; that is, $E^2 = E$. Show that $E$ is diagonalizable and, in fact, can be represented by the diagonal matrix $M = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$, where $r$ is the rank of $E$.


Diagonalizing Real Symmetric Matrices and Quadratic Forms 


9.56. For each of the following symmetric matrices $A$, find an orthogonal matrix $P$ and a diagonal matrix $D$ such that $D = P^{-1}AP$:

\[ \text{(a)}\ A = \begin{bmatrix} 5 & 4 \\ 4 & -1 \end{bmatrix}, \qquad \text{(b)}\ A = \begin{bmatrix} 4 & -1 \\ -1 & 4 \end{bmatrix}, \qquad \text{(c)}\ A = \begin{bmatrix} 7 & 3 \\ 3 & -1 \end{bmatrix} \]

9.57. For each of the following symmetric matrices $B$, find its eigenvalues, a maximal orthogonal set $S$ of eigenvectors, and an orthogonal matrix $P$ such that $D = P^{-1}BP$ is diagonal:

\[ \text{(a)}\ B = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 2 & 2 & 4 \\ 2 & 5 & 8 \\ 4 & 8 & 17 \end{bmatrix} \]

9.58. Using variables $s$ and $t$, find an orthogonal substitution that diagonalizes each of the following quadratic forms:

(a) $q(x, y) = 4x^2 + 8xy - 11y^2$, (b) $q(x, y) = 2x^2 - 6xy + 10y^2$

9.59. For each of the following quadratic forms $q(x, y, z)$, find an orthogonal substitution expressing $x, y, z$ in terms of variables $r, s, t$, and find $q(r, s, t)$:

(a) $q(x, y, z) = 5x^2 + 3y^2 + 12xz$, (b) $q(x, y, z) = 3x^2 - 4xy + 6y^2 + 2xz - 4yz + 3z^2$

9.60. Find a real $2 \times 2$ symmetric matrix $A$ with eigenvalues:
(a) $\lambda = 1$ and $\lambda = 4$ and eigenvector $u = (1, 1)$ belonging to $\lambda = 1$;
(b) $\lambda = 2$ and $\lambda = 3$ and eigenvector $u = (1, 2)$ belonging to $\lambda = 2$.

In each case, find a matrix $B$ for which $B^2 = A$.


Characteristic and Minimal Polynomials 


9.61. Find the characteristic and minimal polynomials of each of the following matrices:

\[ \text{(a)}\ A = \begin{bmatrix} 3 & 1 & -1 \\ 2 & 4 & -2 \\ -1 & -1 & 3 \end{bmatrix}, \qquad \text{(b)}\ B = \begin{bmatrix} 3 & 2 & -1 \\ 3 & 8 & -3 \\ 3 & 6 & -1 \end{bmatrix} \]

9.62. Find the characteristic and minimal polynomials of each of the following matrices:

\[ \text{(a)}\ A = \begin{bmatrix} 2 & 5 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 4 & 2 & 0 \\ 0 & 0 & 3 & 5 & 0 \\ 0 & 0 & 0 & 0 & 7 \end{bmatrix}, \quad \text{(b)}\ B = \begin{bmatrix} 4 & -1 & 0 & 0 & 0 \\ 1 & 2 & 0 & 0 & 0 \\ 0 & 0 & 3 & 1 & 0 \\ 0 & 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 3 \end{bmatrix}, \quad \text{(c)}\ C = \begin{bmatrix} 3 & 2 & 0 & 0 & 0 \\ 1 & 4 & 0 & 0 & 0 \\ 0 & 0 & 3 & 1 & 0 \\ 0 & 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 0 & 4 \end{bmatrix} \]

9.63. Let $A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 2 \\ 0 & 0 & 1 \end{bmatrix}$. Show that $A$ and $B$ have different characteristic polynomials (and so are not similar) but have the same minimal polynomial. Thus, nonsimilar matrices may have the same minimal polynomial.

9.64. Let $A$ be an $n$-square matrix for which $A^k = 0$ for some $k > n$. Show that $A^n = 0$.

9.65. Show that a matrix $A$ and its transpose $A^T$ have the same minimal polynomial.

9.66. Suppose $f(t)$ is an irreducible monic polynomial for which $f(A) = 0$ for a matrix $A$. Show that $f(t)$ is the minimal polynomial of $A$.

9.67. Show that $A$ is a scalar matrix $kI$ if and only if the minimal polynomial of $A$ is $m(t) = t - k$.

9.68. Find a matrix $A$ whose minimal polynomial is (a) $t^3 - 5t^2 + 6t + 8$, (b) $t^4 - 5t^3 - 2t^2 + 7t + 4$.

9.69. Let $f(t)$ and $g(t)$ be monic polynomials (leading coefficient one) of minimal degree for which $A$ is a root. Show $f(t) = g(t)$. [Thus, the minimal polynomial of $A$ is unique.]


ANSWERS TO SUPPLEMENTARY PROBLEMS 


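Notation: $M = [R_1;\ R_2;\ \ldots]$ denotes a matrix $M$ with rows $R_1, R_2, \ldots$.

9.38. $f(A) = [-26, -3;\ 5, -27]$, $g(A) = [-40, 39;\ -65, -27]$, $f(B) = [3, 6;\ 0, 9]$, $g(B) = [3, 12;\ 0, 15]$

9.39. $A^2 = [1, 4;\ 0, 1]$, $A^3 = [1, 6;\ 0, 1]$, $A^n = [1, 2n;\ 0, 1]$, $A^{-1} = [1, -2;\ 0, 1]$

9.40. Let $A = [2, a, b;\ 0, 2, c;\ 0, 0, 2]$. Set $B = A^3$ and then $a = 1$, $b = -\tfrac{1}{2}$, $c = 1$

9.41. Find $\Delta(t)$: (a) $t^2 + t - 11$, (b) $t^2 + 2t + 13$, (c) $t^3 - 7t^2 + 6t - 1$

9.45. (a) $\lambda = 1$, $u = (3, 1)$; $\lambda = -4$, $v = (1, 2)$, (b) $\lambda = 4$, $u = (2, 1)$, (c) $\lambda = -1$, $u = (2, 1)$; $\lambda = -5$, $v = (2, 3)$. Only $A$ and $C$ can be diagonalized; use $P = [u, v]$.

9.46. (a) $\lambda = 1$, $u = (1, 1)$; $\lambda = 4$, $v = (1, -2)$, (b) $P = [u, v]$, (c) $f(A) = [19, -13;\ -26, 32]$, $A^8 = [21\,846, -21\,845;\ -43\,690, 43\,691]$, (d) $B = [\tfrac{4}{3}, -\tfrac{1}{3};\ -\tfrac{2}{3}, \tfrac{5}{3}]$

9.47. (a) $\lambda = 1$, $u = (3, -2)$; $\lambda = 2$, $v = (2, -1)$, (b) $P = [u, v]$, (c) $f(A) = [2, -6;\ 2, 9]$, $A^8 = [1021, 1530;\ -510, -764]$, (d) $B = [-3 + 4\sqrt{2},\ -6 + 6\sqrt{2};\ 2 - 2\sqrt{2},\ 4 - 3\sqrt{2}]$

9.48. (a) $\lambda = -2$, $u = (1, 1, 0)$, $v = (1, 0, -1)$; $\lambda = 4$, $w = (1, 1, 2)$, (b) $\lambda = 2$, $u = (1, 1, 0)$; $\lambda = -4$, $v = (0, 1, 1)$, (c) $\lambda = 3$, $u = (1, 1, 0)$, $v = (1, 0, 1)$; $\lambda = 1$, $w = (2, -1, 1)$. Only $A$ and $C$ can be diagonalized; use $P = [u, v, w]$.

9.49. (a) $\lambda = 2$, $u = (3, -1)$; $\lambda = 6$, $v = (1, 1)$, (b) No real eigenvalues

9.50. We need $[-\mathrm{tr}(A)]^2 - 4[\det(A)] > 0$ or $(a - d)^2 + 4bc > 0$.

9.51. $A = [1, 1;\ 0, 1]$

9.56. (a) $P = [2, -1;\ 1, 2]/\sqrt{5}$, $D = [7, 0;\ 0, -3]$, (b) $P = [1, 1;\ 1, -1]/\sqrt{2}$, $D = [3, 0;\ 0, 5]$, (c) $P = [3, -1;\ 1, 3]/\sqrt{10}$, $D = [8, 0;\ 0, -2]$

9.57. (a) $\lambda = -1$, $u = (1, -1, 0)$, $v = (1, 1, -2)$; $\lambda = 2$, $w = (1, 1, 1)$, (b) $\lambda = 1$, $u = (2, 1, -1)$, $v = (2, -3, 1)$; $\lambda = 22$, $w = (1, 2, 4)$. Normalize $u, v, w$, obtaining $\hat{u}, \hat{v}, \hat{w}$, and set $P = [\hat{u}, \hat{v}, \hat{w}]$. (Remark: $u$ and $v$ are not unique.)

9.58. (a) $x = (4s + t)/\sqrt{17}$, $y = (-s + 4t)/\sqrt{17}$, $q(s, t) = 5s^2 - 12t^2$, (b) $x = (3s - t)/\sqrt{10}$, $y = (s + 3t)/\sqrt{10}$, $q(s, t) = s^2 + 11t^2$

9.59. (a) $x = (3s + 2t)/\sqrt{13}$, $y = r$, $z = (2s - 3t)/\sqrt{13}$, $q(r, s, t) = 3r^2 + 9s^2 - 4t^2$, (b) $x = 5Ks + Lt$, $y = Jr + 2Ks - 2Lt$, $z = 2Jr - Ks - Lt$, where $J = 1/\sqrt{5}$, $K = 1/\sqrt{30}$, $L = 1/\sqrt{6}$; $q(r, s, t) = 2r^2 + 2s^2 + 8t^2$

9.60. (a) $A = \tfrac{1}{2}[5, -3;\ -3, 5]$, $B = \tfrac{1}{2}[3, -1;\ -1, 3]$, (b) $A = \tfrac{1}{5}[14, -2;\ -2, 11]$, $B = \tfrac{1}{5}[\sqrt{2} + 4\sqrt{3},\ 2\sqrt{2} - 2\sqrt{3};\ 2\sqrt{2} - 2\sqrt{3},\ 4\sqrt{2} + \sqrt{3}]$

9.61. (a) $\Delta(t) = m(t) = (t - 2)^2(t - 6)$, (b) $\Delta(t) = (t - 2)^2(t - 6)$, $m(t) = (t - 2)(t - 6)$

9.62. (a) $\Delta(t) = (t - 2)^3(t - 7)^2$, $m(t) = (t - 2)^2(t - 7)$, (b) $\Delta(t) = (t - 3)^5$, $m(t) = (t - 3)^3$, (c) $\Delta(t) = (t - 2)^2(t - 4)^2(t - 5)$, $m(t) = (t - 2)(t - 4)(t - 5)$

9.68. Let $A$ be the companion matrix [Example 9.12(b)] with last column: (a) $[-8, -6, 5]^T$, (b) $[-4, -7, 2, 5]^T$

9.69. Hint: $A$ is a root of $h(t) = f(t) - g(t)$, where $h(t) = 0$ or the degree of $h(t)$ is less than the degree of $f(t)$.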


CHAPTER 10 


Canonical Forms 


10.1 Introduction 


Let T be a linear operator on a vector space of finite dimension. As seen in Chapter 6, T may not have a 
diagonal matrix representation. However, it is still possible to ‘‘simplify’’ the matrix representation of T 
in a number of ways. This is the main topic of this chapter. In particular, we obtain the primary 
decomposition theorem, and the triangular, Jordan, and rational canonical forms. 

We comment that the triangular and Jordan canonical forms exist for T if and only if the characteristic 
polynomial A(t) of T has all its roots in the base field K. This is always true if K is the complex field C 
but may not be true if K is the real field R. 

We also introduce the idea of a quotient space. This is a very powerful tool, and it will be used in the 
proof of the existence of the triangular and rational canonical forms. 


10.2 Triangular Form 


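Let $T$ be a linear operator on an $n$-dimensional vector space $V$. Suppose $T$ can be represented by the triangular matrix

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ & a_{22} & \cdots & a_{2n} \\ & & \ddots & \vdots \\ & & & a_{nn} \end{bmatrix} \]

Then the characteristic polynomial $\Delta(t)$ of $T$ is a product of linear factors; that is,

\[ \Delta(t) = \det(tI - A) = (t - a_{11})(t - a_{22}) \cdots (t - a_{nn}) \]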


The converse is also true and is an important theorem (proved in Problem 10.28). 


THEOREM 10.1: Let $T: V \to V$ be a linear operator whose characteristic polynomial factors into linear polynomials. Then there exists a basis of $V$ in which $T$ is represented by a triangular matrix.


THEOREM 10.1: (Alternative Form) Let $A$ be a square matrix whose characteristic polynomial factors into linear polynomials. Then $A$ is similar to a triangular matrix; that is, there exists an invertible matrix $P$ such that $P^{-1}AP$ is triangular.


We say that an operator $T$ can be brought into triangular form if it can be represented by a triangular matrix. Note that in this case, the eigenvalues of $T$ are precisely those entries appearing on the main diagonal. We give an application of this remark.

EXAMPLE 10.1 Let $A$ be a square matrix over the complex field $\mathbf{C}$. Suppose $\lambda$ is an eigenvalue of $A^2$. Show that $\sqrt{\lambda}$ or $-\sqrt{\lambda}$ is an eigenvalue of $A$.

By Theorem 10.1, $A$ and $A^2$ are similar, respectively, to triangular matrices of the form

\[ B = \begin{bmatrix} \mu_1 & * & \cdots & * \\ & \mu_2 & \cdots & * \\ & & \ddots & \vdots \\ & & & \mu_n \end{bmatrix} \qquad \text{and} \qquad B^2 = \begin{bmatrix} \mu_1^2 & * & \cdots & * \\ & \mu_2^2 & \cdots & * \\ & & \ddots & \vdots \\ & & & \mu_n^2 \end{bmatrix} \]

Because similar matrices have the same eigenvalues, $\lambda = \mu_i^2$ for some $i$. Hence, $\mu_i = \sqrt{\lambda}$ or $\mu_i = -\sqrt{\lambda}$ is an eigenvalue of $A$.


10.3 Invariance 


Let $T: V \to V$ be linear. A subspace $W$ of $V$ is said to be invariant under $T$ or $T$-invariant if $T$ maps $W$ into itself; that is, if $v \in W$ implies $T(v) \in W$. In this case, $T$ restricted to $W$ defines a linear operator on $W$; that is, $T$ induces a linear operator $\hat{T}: W \to W$ defined by $\hat{T}(w) = T(w)$ for every $w \in W$.


EXAMPLE 10.2 


(a) Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be the following linear operator, which rotates each vector $v$ about the $z$-axis by an angle $\theta$ (shown in Fig. 10-1):

\[ T(x, y, z) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta,\ z) \]

[Figure 10-1: a vector $v$ and its image $T(v)$]

Observe that each vector $w = (a, b, 0)$ in the $xy$-plane $W$ remains in $W$ under the mapping $T$; hence, $W$ is $T$-invariant. Observe also that the $z$-axis $U$ is invariant under $T$. Furthermore, the restriction of $T$ to $W$ rotates each vector about the origin $O$, and the restriction of $T$ to $U$ is the identity mapping of $U$.


(b) Nonzero eigenvectors of a linear operator $T: V \to V$ may be characterized as generators of $T$-invariant one-dimensional subspaces. Suppose $T(v) = \lambda v$, $v \neq 0$. Then $W = \{kv,\ k \in K\}$, the one-dimensional subspace generated by $v$, is invariant under $T$ because

\[ T(kv) = kT(v) = k(\lambda v) = \lambda(kv) \in W \]

Conversely, suppose $\dim U = 1$ and $u \neq 0$ spans $U$, and $U$ is invariant under $T$. Then $T(u) \in U$ and so $T(u)$ is a multiple of $u$; that is, $T(u) = \mu u$. Hence, $u$ is an eigenvector of $T$.


The next theorem (proved in Problem 10.3) gives us an important class of invariant subspaces. 


THEOREM 10.2: Let $T: V \to V$ be any linear operator, and let $f(t)$ be any polynomial. Then the kernel of $f(T)$ is invariant under $T$.


The notion of invariance is related to matrix representations (Problem 10.5) as follows. 


THEOREM 10.3: Suppose $W$ is an invariant subspace of $T: V \to V$. Then $T$ has a block matrix representation $\begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where $A$ is a matrix representation of the restriction $\hat{T}$ of $T$ to $W$.


10.4 Invariant Direct-Sum Decompositions


A vector space $V$ is termed the direct sum of subspaces $W_1, \ldots, W_r$, written

\[ V = W_1 \oplus W_2 \oplus \cdots \oplus W_r \]

if every vector $v \in V$ can be written uniquely in the form

\[ v = w_1 + w_2 + \cdots + w_r, \qquad \text{with } w_i \in W_i \]

The following theorem (proved in Problem 10.7) holds.


THEOREM 10.4: Suppose $W_1, W_2, \ldots, W_r$ are subspaces of $V$, and suppose

\[ B_1 = \{w_{11}, w_{12}, \ldots, w_{1n_1}\}, \quad \ldots, \quad B_r = \{w_{r1}, w_{r2}, \ldots, w_{rn_r}\} \]

are bases of $W_1, W_2, \ldots, W_r$, respectively. Then $V$ is the direct sum of the $W_i$ if and only if the union $B = B_1 \cup \cdots \cup B_r$ is a basis of $V$.


Now suppose $T: V \to V$ is linear and $V$ is the direct sum of (nonzero) $T$-invariant subspaces $W_1, W_2, \ldots, W_r$; that is,

\[ V = W_1 \oplus \cdots \oplus W_r \qquad \text{and} \qquad T(W_i) \subseteq W_i, \quad i = 1, \ldots, r \]

Let $T_i$ denote the restriction of $T$ to $W_i$. Then $T$ is said to be decomposable into the operators $T_i$ or $T$ is said to be the direct sum of the $T_i$, written $T = T_1 \oplus \cdots \oplus T_r$. Also, the subspaces $W_1, \ldots, W_r$ are said to reduce $T$ or to form a $T$-invariant direct-sum decomposition of $V$.

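Consider the special case where two subspaces $U$ and $W$ reduce an operator $T: V \to V$; say $\dim U = 2$ and $\dim W = 3$, and suppose $\{u_1, u_2\}$ and $\{w_1, w_2, w_3\}$ are bases of $U$ and $W$, respectively. If $T_1$ and $T_2$ denote the restrictions of $T$ to $U$ and $W$, respectively, then

\[ \begin{aligned} T_1(u_1) &= a_{11}u_1 + a_{12}u_2 \\ T_1(u_2) &= a_{21}u_1 + a_{22}u_2 \end{aligned} \qquad\qquad \begin{aligned} T_2(w_1) &= b_{11}w_1 + b_{12}w_2 + b_{13}w_3 \\ T_2(w_2) &= b_{21}w_1 + b_{22}w_2 + b_{23}w_3 \\ T_2(w_3) &= b_{31}w_1 + b_{32}w_2 + b_{33}w_3 \end{aligned} \]

Accordingly, the following matrices $A$, $B$, $M$ are the matrix representations of $T_1$, $T_2$, $T$, respectively:

\[ A = \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{bmatrix}, \qquad B = \begin{bmatrix} b_{11} & b_{21} & b_{31} \\ b_{12} & b_{22} & b_{32} \\ b_{13} & b_{23} & b_{33} \end{bmatrix}, \qquad M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix} \]

The block diagonal matrix $M$ results from the fact that $\{u_1, u_2, w_1, w_2, w_3\}$ is a basis of $V$ (Theorem 10.4), and that $T(u_i) = T_1(u_i)$ and $T(w_j) = T_2(w_j)$.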
A generalization of the above argument gives us the following theorem. 


THEOREM 10.5: Suppose $T: V \to V$ is linear and suppose $V$ is the direct sum of $T$-invariant subspaces, say, $W_1, \ldots, W_r$. If $A_i$ is a matrix representation of the restriction of $T$ to $W_i$, then $T$ can be represented by the block diagonal matrix:

\[ M = \mathrm{diag}(A_1, A_2, \ldots, A_r) \]


10.5 Primary Decomposition 


The following theorem shows that any operator $T: V \to V$ is decomposable into operators whose minimum polynomials are powers of irreducible polynomials. This is the first step in obtaining a canonical form for $T$.

THEOREM 10.6: (Primary Decomposition Theorem) Let $T: V \to V$ be a linear operator with minimal polynomial

\[ m(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r} \]

where the $f_i(t)$ are distinct monic irreducible polynomials. Then $V$ is the direct sum of $T$-invariant subspaces $W_1, \ldots, W_r$, where $W_i$ is the kernel of $f_i(T)^{n_i}$. Moreover, $f_i(t)^{n_i}$ is the minimal polynomial of the restriction of $T$ to $W_i$.


The above polynomials $f_i(t)^{n_i}$ are relatively prime. Therefore, the above fundamental theorem follows (Problem 10.11) from the next two theorems (proved in Problems 10.9 and 10.10, respectively).


THEOREM 10.7: Suppose $T: V \to V$ is linear, and suppose $f(t) = g(t)h(t)$ are polynomials such that $f(T) = 0$ and $g(t)$ and $h(t)$ are relatively prime. Then $V$ is the direct sum of the $T$-invariant subspaces $U$ and $W$, where $U = \mathrm{Ker}\ g(T)$ and $W = \mathrm{Ker}\ h(T)$.


THEOREM 10.8: In Theorem 10.7, if $f(t)$ is the minimal polynomial of $T$ [and $g(t)$ and $h(t)$ are monic], then $g(t)$ and $h(t)$ are the minimal polynomials of the restrictions of $T$ to $U$ and $W$, respectively.


We will also use the primary decomposition theorem to prove the following useful characterization of 
diagonalizable operators (see Problem 10.12 for the proof). 


THEOREM 10.9: A linear operator $T: V \to V$ is diagonalizable if and only if its minimal polynomial $m(t)$ is a product of distinct linear polynomials.


THEOREM 10.9: (Alternative Form) A matrix $A$ is similar to a diagonal matrix if and only if its minimal polynomial is a product of distinct linear polynomials.


EXAMPLE 10.3 Suppose $A \neq I$ is a square matrix for which $A^3 = I$. Determine whether or not $A$ is similar to a diagonal matrix if $A$ is a matrix over: (i) the real field $\mathbf{R}$, (ii) the complex field $\mathbf{C}$.

Because $A^3 = I$, $A$ is a zero of the polynomial $f(t) = t^3 - 1 = (t - 1)(t^2 + t + 1)$. The minimal polynomial $m(t)$ of $A$ cannot be $t - 1$, because $A \neq I$. Hence,

\[ m(t) = t^2 + t + 1 \qquad \text{or} \qquad m(t) = t^3 - 1 \]

Because neither polynomial is a product of linear polynomials over $\mathbf{R}$, $A$ is not diagonalizable over $\mathbf{R}$. On the other hand, each of the polynomials is a product of distinct linear polynomials over $\mathbf{C}$. Hence, $A$ is diagonalizable over $\mathbf{C}$.
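
For a concrete instance of Example 10.3, the sketch below uses one particular matrix chosen for illustration (it is not taken from the text): $A$ with rows $(0, -1)$ and $(1, -1)$. The code checks that $A^3 = I$, that $A$ is annihilated by $t^2 + t + 1$, and that this quadratic has negative discriminant, so it has no real roots and does not split over $\mathbf{R}$.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

I = [[1, 0], [0, 1]]
A = [[0, -1], [1, -1]]            # illustrative choice, assumed for this sketch

A2 = matmul(A, A)
A3 = matmul(A2, A)

cube_is_identity = (A3 == I)
annihilated = (add(add(A2, A), I) == [[0, 0], [0, 0]])   # A^2 + A + I = 0
discriminant = 1 * 1 - 4 * 1 * 1                          # of t^2 + t + 1

print(cube_is_identity, annihilated, discriminant)
```

Since $A \neq I$ and $A^2 + A + I = 0$, the minimal polynomial of this particular $A$ is $t^2 + t + 1$, the first of the two cases in the example.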


10.6 Nilpotent Operators 


A linear operator $T: V \to V$ is termed nilpotent if $T^n = 0$ for some positive integer $n$; we call $k$ the index of nilpotency of $T$ if $T^k = 0$ but $T^{k-1} \neq 0$. Analogously, a square matrix $A$ is termed nilpotent if $A^n = 0$ for some positive integer $n$, and of index $k$ if $A^k = 0$ but $A^{k-1} \neq 0$. Clearly the minimum polynomial of a nilpotent operator (matrix) of index $k$ is $m(t) = t^k$; hence, 0 is its only eigenvalue.
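
The definition of the index of nilpotency translates directly into a search over successive powers. The following sketch (function names are illustrative) returns the least $k$ with $A^k = 0$, or `None` if no power up to $A^n$ vanishes; it relies on the fact that a nilpotent $n$-square matrix has index at most $n$, because its minimal polynomial $t^k$ divides its characteristic polynomial of degree $n$.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_zero(M):
    return all(x == 0 for row in M for x in row)

def nilpotency_index(A):
    """Least k >= 1 with A^k = 0, or None if A is not nilpotent."""
    power = A
    for k in range(1, len(A) + 1):
        if is_zero(power):        # here power == A^k
            return k
        power = matmul(power, A)
    return None

# 4 x 4 matrix with 1's on the superdiagonal and 0's elsewhere.
N4 = [[0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 1],
      [0, 0, 0, 0]]

index_N4 = nilpotency_index(N4)
index_small = nilpotency_index([[0, 1, 0], [0, 0, 0], [0, 0, 0]])
index_projection = nilpotency_index([[1, 0], [0, 0]])
print(index_N4, index_small, index_projection)
```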


EXAMPLE 10.4 The following two $r$-square matrices will be used throughout the chapter:

\[ N = N(r) = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{bmatrix} \qquad \text{and} \qquad J(\lambda) = \begin{bmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{bmatrix} \]

The first matrix $N$, called a Jordan nilpotent block, consists of 1's above the diagonal (called the superdiagonal), and 0's elsewhere. It is a nilpotent matrix of index $r$. (The matrix $N$ of order 1 is just the $1 \times 1$ zero matrix $[0]$.)

The second matrix $J(\lambda)$, called a Jordan block belonging to the eigenvalue $\lambda$, consists of $\lambda$'s on the diagonal, 1's on the superdiagonal, and 0's elsewhere. Observe that

\[ J(\lambda) = \lambda I + N \]

In fact, we will prove that any linear operator $T$ can be decomposed into operators, each of which is the sum of a scalar operator and a nilpotent operator.


The following (proved in Problem 10.16) is a fundamental result on nilpotent operators. 


THEOREM 10.10: Let $T: V \to V$ be a nilpotent operator of index $k$. Then $T$ has a block diagonal matrix representation in which each diagonal entry is a Jordan nilpotent block $N$. There is at least one $N$ of order $k$, and all other $N$ are of orders $\leq k$. The number of $N$ of each possible order is uniquely determined by $T$. The total number of $N$ of all orders is equal to the nullity of $T$.


The proof of Theorem 10.10 shows that the number of $N$ of order $i$ is equal to $2m_i - m_{i+1} - m_{i-1}$, where $m_i$ is the nullity of $T^i$.
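
The counting formula $2m_i - m_{i+1} - m_{i-1}$ can be tried on a matrix whose block structure is known in advance. The sketch below is illustrative only: it builds a block diagonal matrix from Jordan nilpotent blocks of assumed orders 3, 1, 1, computes the nullities $m_i$ by exact Gaussian elimination (with $m_0 = 0$), and recovers the block counts. All function names are hypothetical helpers.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def nilpotent_block_diag(sizes):
    """Block diagonal matrix whose blocks are Jordan nilpotent blocks N."""
    n = sum(sizes)
    M = [[0] * n for _ in range(n)]
    start = 0
    for size in sizes:
        for i in range(start, start + size - 1):
            M[i][i + 1] = 1
        start += size
    return M

def block_counts(T):
    """Map order i -> number of blocks of order i, via 2*m_i - m_{i+1} - m_{i-1}."""
    n = len(T)
    nullity = [0]                                  # m_0 = 0
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(n + 1):
        power = matmul(power, T)
        nullity.append(n - rank(power))            # m_1, ..., m_{n+1}
    counts = {}
    for i in range(1, n + 1):
        c = 2 * nullity[i] - nullity[i + 1] - nullity[i - 1]
        if c:
            counts[i] = c
    return counts

T = nilpotent_block_diag([3, 1, 1])
counts = block_counts(T)
print(counts)
```

The total number of blocks found equals the nullity of $T$ (here 3), as the theorem states.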


10.7 Jordan Canonical Form 


An operator T can be put into Jordan canonical form if its characteristic and minimal polynomials factor 
into linear polynomials. This is always true if K is the complex field C. In any case, we can always extend 
the base field K to a field in which the characteristic and minimal polynomials do factor into linear 
factors; thus, in a broad sense, every operator has a Jordan canonical form. Analogously, every matrix is 
similar to a matrix in Jordan canonical form. 

The following theorem (proved in Problem 10.18) describes the Jordan canonical form J of a linear 
operator T. 


THEOREM 10.11: Let $T: V \to V$ be a linear operator whose characteristic and minimal polynomials are, respectively,

\[ \Delta(t) = (t - \lambda_1)^{n_1} \cdots (t - \lambda_r)^{n_r} \qquad \text{and} \qquad m(t) = (t - \lambda_1)^{m_1} \cdots (t - \lambda_r)^{m_r} \]

where the $\lambda_i$ are distinct scalars. Then $T$ has a block diagonal matrix representation $J$ in which each diagonal entry is a Jordan block $J_{ij} = J(\lambda_i)$. For each $\lambda_i$, the corresponding $J_{ij}$ have the following properties:

(i) There is at least one $J_{ij}$ of order $m_i$; all other $J_{ij}$ are of order $\leq m_i$.
(ii) The sum of the orders of the $J_{ij}$ is $n_i$.
(iii) The number of $J_{ij}$ equals the geometric multiplicity of $\lambda_i$.
(iv) The number of $J_{ij}$ of each possible order is uniquely determined by $T$.


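EXAMPLE 10.5 Suppose the characteristic and minimal polynomials of an operator $T$ are, respectively,

\[ \Delta(t) = (t - 2)^4(t - 5)^3 \qquad \text{and} \qquad m(t) = (t - 2)^2(t - 5)^3 \]

Then the Jordan canonical form of $T$ is one of the following block diagonal matrices:

\[ \mathrm{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \begin{bmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{bmatrix} \right) \qquad \text{or} \qquad \mathrm{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, [2], [2], \begin{bmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{bmatrix} \right) \]

The first matrix occurs if $T$ has two independent eigenvectors belonging to the eigenvalue 2; and the second matrix occurs if $T$ has three independent eigenvectors belonging to the eigenvalue 2.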


10.8 Cyclic Subspaces 


Let $T$ be a linear operator on a vector space $V$ of finite dimension over $K$. Suppose $v \in V$ and $v \neq 0$. The set of all vectors of the form $f(T)(v)$, where $f(t)$ ranges over all polynomials over $K$, is a $T$-invariant subspace of $V$ called the $T$-cyclic subspace of $V$ generated by $v$; we denote it by $Z(v, T)$ and denote the restriction of $T$ to $Z(v, T)$ by $T_v$. By Problem 10.56, we could equivalently define $Z(v, T)$ as the intersection of all $T$-invariant subspaces of $V$ containing $v$.

Now consider the sequence

\[ v,\ T(v),\ T^2(v),\ T^3(v),\ \ldots \]

of powers of $T$ acting on $v$. Let $k$ be the least integer such that $T^k(v)$ is a linear combination of those vectors that precede it in the sequence, say,

\[ T^k(v) = -a_{k-1}T^{k-1}(v) - \cdots - a_1T(v) - a_0v \]

Then

\[ m_v(t) = t^k + a_{k-1}t^{k-1} + \cdots + a_1t + a_0 \]

is the unique monic polynomial of lowest degree for which $m_v(T)(v) = 0$. We call $m_v(t)$ the $T$-annihilator of $v$ and $Z(v, T)$.


The following theorem (proved in Problem 10.29) holds. 


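THEOREM 10.12: Let $Z(v, T)$, $T_v$, $m_v(t)$ be defined as above. Then

(i) The set $\{v, T(v), \ldots, T^{k-1}(v)\}$ is a basis of $Z(v, T)$; hence, $\dim Z(v, T) = k$.
(ii) The minimal polynomial of $T_v$ is $m_v(t)$.
(iii) The matrix representation of $T_v$ in the above basis is just the companion matrix $C(m_v)$ of $m_v(t)$; that is,

\[ C(m_v) = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & 0 & \cdots & 0 & -a_2 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & -a_{k-2} \\ 0 & 0 & 0 & \cdots & 1 & -a_{k-1} \end{bmatrix} \]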

10.9 Rational Canonical Form 


In this section, we present the rational canonical form for a linear operator $T: V \to V$. We emphasize that this form exists even when the minimal polynomial cannot be factored into linear polynomials. (Recall that this is not the case for the Jordan canonical form.)

LEMMA 10.13: Let $T: V \to V$ be a linear operator whose minimal polynomial is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial. Then $V$ is the direct sum

\[ V = Z(v_1, T) \oplus \cdots \oplus Z(v_r, T) \]

of $T$-cyclic subspaces $Z(v_i, T)$ with corresponding $T$-annihilators

\[ f(t)^{n_1},\ f(t)^{n_2},\ \ldots,\ f(t)^{n_r}, \qquad n = n_1 \geq n_2 \geq \cdots \geq n_r \]

Any other decomposition of $V$ into $T$-cyclic subspaces has the same number of components and the same set of $T$-annihilators.


We emphasize that the above lemma (proved in Problem 10.31) does not say that the vectors $v_i$ or other $T$-cyclic subspaces $Z(v_i, T)$ are uniquely determined by $T$, but it does say that the set of $T$-annihilators is uniquely determined by $T$. Thus, $T$ has a unique block diagonal matrix representation:

\[ M = \mathrm{diag}(C_1, C_2, \ldots, C_r) \]

where the $C_i$ are companion matrices. In fact, the $C_i$ are the companion matrices of the polynomials $f(t)^{n_i}$.
Using the Primary Decomposition Theorem and Lemma 10.13, we obtain the following result. 
THEOREM 10.14: Let T:V → V be a linear operator with minimal polynomial

m(t) = f_1(t)^{m_1} f_2(t)^{m_2} ... f_s(t)^{m_s}

where the f_i(t) are distinct monic irreducible polynomials. Then T has a unique
block diagonal matrix representation:

M = diag(C_11, C_12, ..., C_1r_1, ..., C_s1, C_s2, ..., C_sr_s)

where the C_ij are companion matrices. In particular, the C_ij are the companion
matrices of the polynomials f_i(t)^{n_ij}, where

m_1 = n_11 ≥ n_12 ≥ ... ≥ n_1r_1, ..., m_s = n_s1 ≥ n_s2 ≥ ... ≥ n_sr_s

The above matrix representation of T is called its rational canonical form. The polynomials f_i(t)^{n_ij}
are called the elementary divisors of T.


EXAMPLE 10.6 Let V be a vector space of dimension 8 over the rational field Q, and let T be a linear operator on
V whose minimal polynomial is

m(t) = f_1(t) f_2(t)^2 = (t^4 - 4t^3 + 6t^2 - 4t - 7)(t - 3)^2

Thus, because dim V = 8, the characteristic polynomial is Δ(t) = f_1(t) f_2(t)^4. Also, the rational canonical form M of T
must have one block the companion matrix of f_1(t) and one block the companion matrix of f_2(t)^2. There are two
possibilities:

(a) diag[C(t^4 - 4t^3 + 6t^2 - 4t - 7), C((t - 3)^2), C((t - 3)^2)]

(b) diag[C(t^4 - 4t^3 + 6t^2 - 4t - 7), C((t - 3)^2), C(t - 3), C(t - 3)]

That is,

(a) diag( [0 0 0 7; 1 0 0 4; 0 1 0 -6; 0 0 1 4], [0 -9; 1 6], [0 -9; 1 6] )

(b) diag( [0 0 0 7; 1 0 0 4; 0 1 0 -6; 0 0 1 4], [0 -9; 1 6], [3], [3] )


10.10 Quotient Spaces 


Let V be a vector space over a field K and let W be a subspace of V. If v is any vector in V, we write
v + W for the set of sums v + w with w ∈ W; that is,

v + W = {v + w : w ∈ W}

These sets are called the cosets of W in V. We show (Problem 10.22) that these cosets partition V into
mutually disjoint subsets.


EXAMPLE 10.7 Let W be the subspace of R^2 defined by

W = {(a, b) : a = b},

that is, W is the line given by the equation x - y = 0. We can view v + W as a translation of the line
obtained by adding the vector v to each point in W. As shown in Fig. 10-2, the coset v + W is also
a line, and it is parallel to W. Thus, the cosets of W in R^2 are precisely all the lines parallel to W.

[Figure 10-2]

In the following theorem, we use the cosets of a subspace W of a vector space V to define a new
vector space; it is called the quotient space of V by W and is denoted by V/W.


THEOREM 10.15: Let W be a subspace of a vector space V over a field K. Then the cosets of W in V
form a vector space over K with the following operations of addition and scalar
multiplication:

(i) (u + W) + (v + W) = (u + v) + W, (ii) k(u + W) = ku + W, where k ∈ K

We note that, in the proof of Theorem 10.15 (Problem 10.24), it is first necessary to show that the
operations are well defined; that is, whenever u + W = u' + W and v + W = v' + W, then

(i) (u + v) + W = (u' + v') + W and (ii) ku + W = ku' + W for any k ∈ K
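The well-definedness condition can be illustrated for the subspace W = {(a, b) : a = b} of Example 10.7. The sketch below is only an illustration in plain Python, not a proof; the function names and the sample vectors are assumptions chosen for the example. Two vectors lie in the same coset of W exactly when they share the value x - y, so that value serves as a label for the coset.

```python
def coset_label(v):
    """Label the coset v + W of W = {(a, b) : a = b} in R^2 by x - y."""
    x, y = v
    return x - y


def add(u, v):
    """Componentwise sum of two vectors in R^2."""
    return (u[0] + v[0], u[1] + v[1])


def scale(k, u):
    """Scalar multiple of a vector in R^2."""
    return (k * u[0], k * u[1])


# Two representatives of one coset, and two representatives of another.
u, u_prime = (3, 1), (5, 3)
v, v_prime = (0, 4), (1, 5)

# Changing representatives does not change the coset of the sum or of the scalar multiple.
same_sum_coset = coset_label(add(u, v)) == coset_label(add(u_prime, v_prime))
same_scaled_coset = coset_label(scale(7, u)) == coset_label(scale(7, u_prime))
```

Both comparisons come out true for these representatives, which is what conditions (i) and (ii) require in general.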


In the case of an invariant subspace, we have the following useful result (proved in Problem 10.27). 


THEOREM 10.16: Suppose W is a subspace invariant under a linear operator T:V → V. Then T
induces a linear operator T̄ on V/W defined by T̄(v + W) = T(v) + W. Moreover,
if T is a zero of any polynomial, then so is T̄. Thus, the minimal polynomial of T̄
divides the minimal polynomial of T.


SOLVED PROBLEMS 


Invariant Subspaces 

10.1. Suppose T:V → V is linear. Show that each of the following is invariant under T:
(a) {0}, (b) V, (c) kernel of T, (d) image of T.

(a) We have T(0) = 0 ∈ {0}; hence, {0} is invariant under T.

(b) For every v ∈ V, T(v) ∈ V; hence, V is invariant under T.

(c) Let u ∈ Ker T. Then T(u) = 0 ∈ Ker T because the kernel of T is a subspace of V. Thus, Ker T is
invariant under T.

(d) Because T(v) ∈ Im T for every v ∈ V, it is certainly true when v ∈ Im T. Hence, the image of T is
invariant under T.


10.2. Suppose {W_i} is a collection of T-invariant subspaces of a vector space V. Show that the
intersection W = ∩_i W_i is also T-invariant.

Suppose v ∈ W; then v ∈ W_i for every i. Because W_i is T-invariant, T(v) ∈ W_i for every i. Thus,
T(v) ∈ W and so W is T-invariant.

10.3. Prove Theorem 10.2: Let T:V → V be linear. For any polynomial f(t), the kernel of f(T) is
invariant under T.

Suppose v ∈ Ker f(T), that is, f(T)(v) = 0. We need to show that T(v) also belongs to the kernel of
f(T), that is, f(T)(T(v)) = (f(T) ∘ T)(v) = 0. Because f(t)t = tf(t), we have f(T) ∘ T = T ∘ f(T).
Thus, as required,

(f(T) ∘ T)(v) = (T ∘ f(T))(v) = T(f(T)(v)) = T(0) = 0


10.4. Find all invariant subspaces of A = [2 -5; 1 -2] viewed as an operator on R^2.

By Problem 10.1, R^2 and {0} are invariant under A. Now if A has any other invariant subspace, it must
be one-dimensional. However, the characteristic polynomial of A is

Δ(t) = t^2 - tr(A) t + |A| = t^2 + 1

Hence, A has no eigenvalues (in R) and so A has no eigenvectors. But the one-dimensional invariant
subspaces correspond to the eigenvectors; thus, R^2 and {0} are the only subspaces invariant under A.


10.5. Prove Theorem 10.3: Suppose W is T-invariant. Then T has a triangular block representation
[A B; 0 C], where A is the matrix representation of the restriction T̂ of T to W.

We choose a basis {w_1, ..., w_r} of W and extend it to a basis {w_1, ..., w_r, v_1, ..., v_s} of V. We have

T̂(w_1) = T(w_1) = a_11 w_1 + ... + a_1r w_r
T̂(w_2) = T(w_2) = a_21 w_1 + ... + a_2r w_r
.....
T̂(w_r) = T(w_r) = a_r1 w_1 + ... + a_rr w_r
T(v_1) = b_11 w_1 + ... + b_1r w_r + c_11 v_1 + ... + c_1s v_s
T(v_2) = b_21 w_1 + ... + b_2r w_r + c_21 v_1 + ... + c_2s v_s
.....
T(v_s) = b_s1 w_1 + ... + b_sr w_r + c_s1 v_1 + ... + c_ss v_s

But the matrix of T in this basis is the transpose of the matrix of coefficients in the above system of
equations (Section 6.2). Therefore, it has the form [A B; 0 C], where A is the transpose of the matrix of
coefficients for the obvious subsystem. By the same argument, A is the matrix of T̂ relative to the basis {w_i}
of W.


10.6. Let T̂ denote the restriction of an operator T to an invariant subspace W. Prove

(a) For any polynomial f(t), f(T̂)(w) = f(T)(w).
(b) The minimal polynomial of T̂ divides the minimal polynomial of T.

(a) If f(t) = 0 or if f(t) is a constant or of degree 1, then the result clearly holds.
Assume deg f = n > 1 and that the result holds for polynomials of degree less than n. Suppose that

f(t) = a_n t^n + a_{n-1} t^{n-1} + ... + a_1 t + a_0

Then f(T̂)(w) = (a_n T̂^n + a_{n-1} T̂^{n-1} + ... + a_0 I)(w)
= (a_n T̂^{n-1})(T̂(w)) + (a_{n-1} T̂^{n-1} + ... + a_0 I)(w)
= (a_n T^{n-1})(T(w)) + (a_{n-1} T^{n-1} + ... + a_0 I)(w) = f(T)(w)

(b) Let m(t) denote the minimal polynomial of T. Then by (a), m(T̂)(w) = m(T)(w) = 0(w) = 0 for
every w ∈ W; that is, T̂ is a zero of the polynomial m(t). Hence, the minimal polynomial of T̂ divides
m(t).

Invariant Direct-Sum Decompositions

10.7. Prove Theorem 10.4: Suppose W_1, W_2, ..., W_r are subspaces of V with respective bases

B_1 = {w_11, w_12, ..., w_1n_1}, ..., B_r = {w_r1, w_r2, ..., w_rn_r}

Then V is the direct sum of the W_i if and only if the union B = ∪_i B_i is a basis of V.


Suppose B is a basis of V. Then, for any v ∈ V,

v = a_11 w_11 + ... + a_1n_1 w_1n_1 + ... + a_r1 w_r1 + ... + a_rn_r w_rn_r = w_1 + w_2 + ... + w_r

where w_i = a_i1 w_i1 + ... + a_in_i w_in_i ∈ W_i. We next show that such a sum is unique. Suppose

v = w'_1 + w'_2 + ... + w'_r, where w'_i ∈ W_i

Because {w_i1, ..., w_in_i} is a basis of W_i, w'_i = b_i1 w_i1 + ... + b_in_i w_in_i, and so

v = b_11 w_11 + ... + b_1n_1 w_1n_1 + ... + b_r1 w_r1 + ... + b_rn_r w_rn_r

Because B is a basis of V, a_ij = b_ij for each i and each j. Hence, w_i = w'_i, and so the sum for v is unique.
Accordingly, V is the direct sum of the W_i.

Conversely, suppose V is the direct sum of the W_i. Then for any v ∈ V, v = w_1 + ... + w_r, where
w_i ∈ W_i. Because {w_ij} is a basis of W_i, each w_i is a linear combination of the w_ij, and so v is a linear
combination of the elements of B. Thus, B spans V. We now show that B is linearly independent. Suppose

a_11 w_11 + ... + a_1n_1 w_1n_1 + ... + a_r1 w_r1 + ... + a_rn_r w_rn_r = 0

Note that a_i1 w_i1 + ... + a_in_i w_in_i ∈ W_i. We also have that 0 = 0 + 0 + ... + 0, where each summand 0 ∈ W_i.
Because such a sum for 0 is unique,

a_i1 w_i1 + ... + a_in_i w_in_i = 0 for i = 1, ..., r

The independence of the bases {w_ij} implies that all the a's are 0. Thus, B is linearly independent and is a
basis of V.


10.8. Suppose T:V → V is linear and suppose T = T_1 ⊕ T_2 with respect to a T-invariant direct-sum
decomposition V = U ⊕ W. Show that

(a) m(t) is the least common multiple of m_1(t) and m_2(t), where m(t), m_1(t), m_2(t) are the
minimum polynomials of T, T_1, T_2, respectively.

(b) Δ(t) = Δ_1(t)Δ_2(t), where Δ(t), Δ_1(t), Δ_2(t) are the characteristic polynomials of T, T_1, T_2,
respectively.

(a) By Problem 10.6, each of m_1(t) and m_2(t) divides m(t). Now suppose f(t) is a multiple of both m_1(t)
and m_2(t); then f(T_1)(U) = 0 and f(T_2)(W) = 0. Let v ∈ V; then v = u + w with u ∈ U and w ∈ W.
Now

f(T)v = f(T)u + f(T)w = f(T_1)u + f(T_2)w = 0 + 0 = 0

That is, T is a zero of f(t). Hence, m(t) divides f(t), and so m(t) is the least common multiple of m_1(t)
and m_2(t).

(b) By Theorem 10.5, T has a matrix representation M = [A 0; 0 B], where A and B are matrix representations
of T_1 and T_2, respectively. Then, as required,

Δ(t) = |tI - M| = det [tI - A 0; 0 tI - B] = |tI - A| |tI - B| = Δ_1(t)Δ_2(t)


10.9. Prove Theorem 10.7: Suppose T:V → V is linear, and suppose f(t) = g(t)h(t) are polynomials
such that f(T) = 0 and g(t) and h(t) are relatively prime. Then V is the direct sum of the
T-invariant subspaces U and W where U = Ker g(T) and W = Ker h(T).

Note first that U and W are T-invariant by Theorem 10.2. Now, because g(t) and h(t) are relatively
prime, there exist polynomials r(t) and s(t) such that

r(t)g(t) + s(t)h(t) = 1

Hence, for the operator T, r(T)g(T) + s(T)h(T) = I (*)
Let v ∈ V; then, by (*), v = r(T)g(T)v + s(T)h(T)v
But the first term in this sum belongs to W = Ker h(T), because

h(T)r(T)g(T)v = r(T)g(T)h(T)v = r(T)f(T)v = r(T)0v = 0

Similarly, the second term belongs to U. Hence, V is the sum of U and W.

To prove that V = U ⊕ W, we must show that a sum v = u + w with u ∈ U, w ∈ W, is uniquely
determined by v. Applying the operator r(T)g(T) to v = u + w and using g(T)u = 0, we obtain

r(T)g(T)v = r(T)g(T)u + r(T)g(T)w = r(T)g(T)w

Also, applying (*) to w alone and using h(T)w = 0, we obtain

w = r(T)g(T)w + s(T)h(T)w = r(T)g(T)w

Both of the above formulas give us w = r(T)g(T)v, and so w is uniquely determined by v. Similarly u is
uniquely determined by v. Hence, V = U ⊕ W, as required.
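The formulas w = r(T)g(T)v and u = s(T)h(T)v from the proof can be tried on a small case. The following is a sketch in plain Python under assumed data, none of which comes from the text: the operator T = [1 1; 0 2], the factors g(t) = t - 1 and h(t) = t - 2, and the constant polynomials r = 1, s = -1, which satisfy r g + s h = 1.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][p] * Y[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]


def mat_vec(M, v):
    """Apply the matrix M to the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def shifted(M, c):
    """Return M - c*I."""
    return [[M[i][j] - (c if i == j else 0) for j in range(len(M))]
            for i in range(len(M))]


T = [[1, 1], [0, 2]]
g_T = shifted(T, 1)                          # g(T) = T - I
h_T = shifted(T, 2)                          # h(T) = T - 2I
f_T = mat_mul(g_T, h_T)                      # f(T) = g(T)h(T), zero for this T

P_W = g_T                                    # r(T)g(T) with r = 1
P_U = [[-x for x in row] for row in h_T]     # s(T)h(T) with s = -1


def decompose(v):
    """Split v as u + w with u in Ker g(T) and w in Ker h(T)."""
    return mat_vec(P_U, v), mat_vec(P_W, v)


u, w = decompose([3, 4])
```

For v = (3, 4) this gives u = (-1, 0) and w = (4, 4); one can confirm that g(T)u = 0, h(T)w = 0, and u + w = v.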


10.10. Prove Theorem 10.8: In Theorem 10.7 (Problem 10.9), if f(t) is the minimal polynomial of T
(and g(t) and h(t) are monic), then g(t) is the minimal polynomial of the restriction T_1 of T to U
and h(t) is the minimal polynomial of the restriction T_2 of T to W.

Let m_1(t) and m_2(t) be the minimal polynomials of T_1 and T_2, respectively. Note that g(T_1) = 0 and
h(T_2) = 0 because U = Ker g(T) and W = Ker h(T). Thus,

m_1(t) divides g(t) and m_2(t) divides h(t) (1)

By Problem 10.8, applied to the decomposition V = U ⊕ W of Problem 10.9, f(t) is the least common
multiple of m_1(t) and m_2(t). But m_1(t) and m_2(t) are relatively prime because g(t) and h(t) are relatively
prime. Accordingly, f(t) = m_1(t)m_2(t). We also have that f(t) = g(t)h(t). These two equations together
with (1) and the fact that all the polynomials are monic imply that g(t) = m_1(t) and h(t) = m_2(t), as required.


10.11. Prove the Primary Decomposition Theorem 10.6: Let T:V → V be a linear operator with
minimal polynomial

m(t) = f_1(t)^{n_1} f_2(t)^{n_2} ... f_r(t)^{n_r}

where the f_i(t) are distinct monic irreducible polynomials. Then V is the direct sum of T-invariant
subspaces W_1, ..., W_r where W_i is the kernel of f_i(T)^{n_i}. Moreover, f_i(t)^{n_i} is the minimal
polynomial of the restriction of T to W_i.

The proof is by induction on r. The case r = 1 is trivial. Suppose that the theorem has been proved for
r - 1. By Theorem 10.7, we can write V as the direct sum of T-invariant subspaces W_1 and V_1, where W_1 is
the kernel of f_1(T)^{n_1} and where V_1 is the kernel of f_2(T)^{n_2} ... f_r(T)^{n_r}. By Theorem 10.8, the minimal
polynomials of the restrictions of T to W_1 and V_1 are f_1(t)^{n_1} and f_2(t)^{n_2} ... f_r(t)^{n_r}, respectively.

Denote the restriction of T to V_1 by T̂_1. By the inductive hypothesis, V_1 is the direct sum of subspaces
W_2, ..., W_r such that W_i is the kernel of f_i(T̂_1)^{n_i} and such that f_i(t)^{n_i} is the minimal polynomial for the
restriction of T̂_1 to W_i. But the kernel of f_i(T)^{n_i}, for i = 2, ..., r, is necessarily contained in V_1, because
f_i(t)^{n_i} divides f_2(t)^{n_2} ... f_r(t)^{n_r}. Thus, the kernel of f_i(T)^{n_i} is the same as the kernel of f_i(T̂_1)^{n_i}, which is W_i.
Also, the restriction of T to W_i is the same as the restriction of T̂_1 to W_i (for i = 2, ..., r); hence, f_i(t)^{n_i} is
also the minimal polynomial for the restriction of T to W_i. Thus, V = W_1 ⊕ W_2 ⊕ ... ⊕ W_r is the desired
decomposition of T.


10.12. Prove Theorem 10.9: A linear operator T:V → V has a diagonal matrix representation if and only
if its minimal polynomial m(t) is a product of distinct linear polynomials.

Suppose m(t) is a product of distinct linear polynomials, say,

m(t) = (t - λ_1)(t - λ_2) ... (t - λ_r)

where the λ_i are distinct scalars. By the Primary Decomposition Theorem, V is the direct sum of subspaces
W_1, ..., W_r, where W_i = Ker(T - λ_i I). Thus, if v ∈ W_i, then (T - λ_i I)(v) = 0 or T(v) = λ_i v. In other
words, every vector in W_i is an eigenvector belonging to the eigenvalue λ_i. By Theorem 10.4, the union of
bases for W_1, ..., W_r is a basis of V. This basis consists of eigenvectors, and so T is diagonalizable.

Conversely, suppose T is diagonalizable (i.e., V has a basis consisting of eigenvectors of T). Let
λ_1, ..., λ_s be the distinct eigenvalues of T. Then the operator

f(T) = (T - λ_1 I)(T - λ_2 I) ... (T - λ_s I)

maps each basis vector into 0. Thus, f(T) = 0, and hence, the minimal polynomial m(t) of T divides the
polynomial

f(t) = (t - λ_1)(t - λ_2) ... (t - λ_s)

Accordingly, m(t) is a product of distinct linear polynomials.


Nilpotent Operators, Jordan Canonical Form 
10.13. Let T:V → V be linear. Suppose, for v ∈ V, T^k(v) = 0 but T^{k-1}(v) ≠ 0. Prove
(a) The set S = {v, T(v), ..., T^{k-1}(v)} is linearly independent.
(b) The subspace W generated by S is T-invariant.
(c) The restriction T̂ of T to W is nilpotent of index k.
(d) Relative to the basis {T^{k-1}(v), ..., T(v), v} of W, the matrix of T̂ is the k-square Jordan
nilpotent block N_k of index k (see Example 10.5).

(a) Suppose

a v + a_1 T(v) + a_2 T^2(v) + ... + a_{k-1} T^{k-1}(v) = 0 (*)

Applying T^{k-1} to (*) and using T^k(v) = 0, we obtain a T^{k-1}(v) = 0; because T^{k-1}(v) ≠ 0, a = 0.
Now applying T^{k-2} to (*) and using T^k(v) = 0 and a = 0, we find a_1 T^{k-1}(v) = 0; hence, a_1 = 0.
Next applying T^{k-3} to (*) and using T^k(v) = 0 and a = a_1 = 0, we obtain a_2 T^{k-1}(v) = 0; hence,
a_2 = 0. Continuing this process, we find that all the a's are 0; hence, S is independent.


(b) Let w ∈ W. Then

w = b v + b_1 T(v) + b_2 T^2(v) + ... + b_{k-1} T^{k-1}(v)

Using T^k(v) = 0, we have

T(w) = b T(v) + b_1 T^2(v) + ... + b_{k-2} T^{k-1}(v) ∈ W

Thus, W is T-invariant.

(c) By hypothesis, T^k(v) = 0. Hence, for i = 0, ..., k - 1,

T̂^k(T^i(v)) = T^{k+i}(v) = 0

That is, applying T̂^k to each generator of W, we obtain 0; hence, T̂^k = 0 and so T̂ is nilpotent of index
at most k. On the other hand, T̂^{k-1}(v) = T^{k-1}(v) ≠ 0; hence, T̂ is nilpotent of index exactly k.


(d) For the basis {T^{k-1}(v), T^{k-2}(v), ..., T(v), v} of W,

T̂(T^{k-1}(v)) = T^k(v) = 0
T̂(T^{k-2}(v)) = T^{k-1}(v)
T̂(T^{k-3}(v)) = T^{k-2}(v)
.....
T̂(T(v)) = T^2(v)
T̂(v) = T(v)

Hence, as required, the matrix of T̂ in this basis is the k-square Jordan nilpotent block N_k.
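The Jordan nilpotent block N_k of Problem 10.13(d) has 1's on the superdiagonal and 0's elsewhere, and its index of nilpotency is exactly k. The sketch below, in plain Python with illustrative helper names that are not from the text, builds N_k and finds the least power that vanishes.

```python
def jordan_nilpotent_block(k):
    """k x k matrix with 1's on the superdiagonal and 0's elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(k)] for i in range(k)]


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][p] * Y[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]


def nilpotency_index(M):
    """Least p >= 1 with M^p = 0, or None if no power up to the size of M vanishes."""
    n = len(M)
    power = M
    for p in range(1, n + 1):
        if all(entry == 0 for row in power for entry in row):
            return p
        power = mat_mul(power, M)
    return None


N4 = jordan_nilpotent_block(4)
index_of_N4 = nilpotency_index(N4)
```

For k = 4 the index comes out as 4, matching part (c) of the problem.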


10.14. Let T:V → V be linear. Let U = Ker T^i and W = Ker T^{i+1}. Show that
(a) U ⊆ W, (b) T(W) ⊆ U.

(a) Suppose u ∈ U = Ker T^i. Then T^i(u) = 0 and so T^{i+1}(u) = T(T^i(u)) = T(0) = 0. Thus,
u ∈ Ker T^{i+1} = W. But this is true for every u ∈ U; hence, U ⊆ W.

(b) Similarly, if w ∈ W = Ker T^{i+1}, then T^{i+1}(w) = 0. Thus, T^i(T(w)) = T^{i+1}(w) = 0; that is,
T(w) ∈ Ker T^i = U, and so T(W) ⊆ U.
10.15. Let T:V → V be linear. Let X = Ker T^{i-2}, Y = Ker T^{i-1}, Z = Ker T^i. Therefore (Problem 10.14),
X ⊆ Y ⊆ Z. Suppose

{u_1, ..., u_r}, {u_1, ..., u_r, v_1, ..., v_s}, {u_1, ..., u_r, v_1, ..., v_s, w_1, ..., w_t}

are bases of X, Y, Z, respectively. Show that

S = {u_1, ..., u_r, T(w_1), ..., T(w_t)}

is contained in Y and is linearly independent.

By Problem 10.14, T(Z) ⊆ Y, and hence S ⊆ Y. Now suppose S is linearly dependent. Then there
exists a relation

a_1 u_1 + ... + a_r u_r + b_1 T(w_1) + ... + b_t T(w_t) = 0

where at least one coefficient is not zero. Furthermore, because {u_i} is independent, at least one of the b_k
must be nonzero. Transposing, we find

b_1 T(w_1) + ... + b_t T(w_t) = -a_1 u_1 - ... - a_r u_r ∈ X = Ker T^{i-2}

Hence, T^{i-2}(b_1 T(w_1) + ... + b_t T(w_t)) = 0
Thus, T^{i-1}(b_1 w_1 + ... + b_t w_t) = 0, and so b_1 w_1 + ... + b_t w_t ∈ Y = Ker T^{i-1}

Because {u_i, v_j} generates Y, we obtain a relation among the u_i, v_j, w_k where one of the coefficients (i.e.,
one of the b_k) is not zero. This contradicts the fact that {u_i, v_j, w_k} is independent. Hence, S must also be
independent.


10.16. Prove Theorem 10.10: Let T:V → V be a nilpotent operator of index k. Then T has a unique
block diagonal matrix representation consisting of Jordan nilpotent blocks N. There is at least
one N of order k, and all other N are of orders ≤ k. The total number of N of all orders is equal to
the nullity of T.

Suppose dim V = n. Let W_1 = Ker T, W_2 = Ker T^2, ..., W_k = Ker T^k. Let us set m_i = dim W_i, for
i = 1, ..., k. Because T is of index k, W_k = V and W_{k-1} ≠ V and so m_{k-1} < m_k = n. By Problem 10.14,

W_1 ⊆ W_2 ⊆ ... ⊆ W_k = V

Thus, by induction, we can choose a basis {u_1, ..., u_n} of V such that {u_1, ..., u_{m_i}} is a basis of W_i.


We now choose a new basis for V with respect to which T has the desired form. It will be convenient
to label the members of this new basis by pairs of indices. We begin by setting

v(1, k) = u_{m_{k-1}+1}, v(2, k) = u_{m_{k-1}+2}, ..., v(m_k - m_{k-1}, k) = u_{m_k}

and setting

v(1, k-1) = Tv(1, k), v(2, k-1) = Tv(2, k), ..., v(m_k - m_{k-1}, k-1) = Tv(m_k - m_{k-1}, k)

By the preceding problem,

S_1 = {u_1, ..., u_{m_{k-2}}, v(1, k-1), ..., v(m_k - m_{k-1}, k-1)}

is a linearly independent subset of W_{k-1}. We extend S_1 to a basis of W_{k-1} by adjoining new elements (if
necessary), which we denote by

v(m_k - m_{k-1} + 1, k-1), v(m_k - m_{k-1} + 2, k-1), ..., v(m_{k-1} - m_{k-2}, k-1)

Next we set

v(1, k-2) = Tv(1, k-1), v(2, k-2) = Tv(2, k-1), ..., v(m_{k-1} - m_{k-2}, k-2) = Tv(m_{k-1} - m_{k-2}, k-1)

Again by the preceding problem,

S_2 = {u_1, ..., u_{m_{k-3}}, v(1, k-2), ..., v(m_{k-1} - m_{k-2}, k-2)}

is a linearly independent subset of W_{k-2}, which we can extend to a basis of W_{k-2} by adjoining elements

v(m_{k-1} - m_{k-2} + 1, k-2), v(m_{k-1} - m_{k-2} + 2, k-2), ..., v(m_{k-2} - m_{k-3}, k-2)

Continuing in this manner, we get a new basis for V, which for convenient reference we arrange as follows:

v(1, k), ..., v(m_k - m_{k-1}, k)
v(1, k-1), ..., v(m_k - m_{k-1}, k-1), ..., v(m_{k-1} - m_{k-2}, k-1)
.....
v(1, 2), ..., v(m_k - m_{k-1}, 2), ..., v(m_{k-1} - m_{k-2}, 2), ..., v(m_2 - m_1, 2)
v(1, 1), ..., v(m_k - m_{k-1}, 1), ..., v(m_{k-1} - m_{k-2}, 1), ..., v(m_2 - m_1, 1), ..., v(m_1, 1)


The bottom row forms a basis of W_1, the bottom two rows form a basis of W_2, and so forth. But what is
important for us is that T maps each vector into the vector immediately below it in the table or into 0 if the
vector is in the bottom row. That is,

Tv(i, j) = v(i, j - 1) for j > 1, and Tv(i, j) = 0 for j = 1

Now it is clear [see Problem 10.13(d)] that T will have the desired form if the v(i, j) are ordered
lexicographically: beginning with v(1, 1) and moving up the first column to v(1, k), then jumping to v(2, 1)
and moving up the second column as far as possible.

Moreover, there will be exactly m_k - m_{k-1} diagonal entries of order k. Also, there will be

(m_{k-1} - m_{k-2}) - (m_k - m_{k-1}) = 2m_{k-1} - m_k - m_{k-2} diagonal entries of order k - 1
.....
2m_2 - m_1 - m_3 diagonal entries of order 2
2m_1 - m_2 diagonal entries of order 1

as can be read off directly from the table. In particular, because the numbers m_1, ..., m_k are uniquely
determined by T, the number of diagonal entries of each order is uniquely determined by T. Finally, the
identity

m_1 = (m_k - m_{k-1}) + (2m_{k-1} - m_k - m_{k-2}) + ... + (2m_2 - m_1 - m_3) + (2m_1 - m_2)

shows that the nullity m_1 of T is the total number of diagonal entries of T.
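The counting formulas at the end of the proof translate directly into a computation: with m_0 = 0 and m_{k+1} = n, the number of blocks of order j is 2m_j - m_{j-1} - m_{j+1}. The following is a minimal sketch in plain Python; the function names and the two sample matrices are assumptions chosen for illustration, and exact rational arithmetic is used for the rank so that no rounding is involved.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][p] * Y[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]


def rank(M):
    """Rank of M by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def jordan_block_counts(N):
    """Return {order: number of Jordan nilpotent blocks of that order} for nilpotent N."""
    n = len(N)
    m = [0]                                   # m[i] = dim Ker N^i, with m[0] = 0
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    while m[-1] < n:
        power = mat_mul(power, N)
        nullity = n - rank(power)
        if nullity == m[-1]:                  # kernels stopped growing before reaching V
            raise ValueError("matrix is not nilpotent")
        m.append(nullity)
    k = len(m) - 1                            # index of nilpotency
    m.append(n)                               # m[k+1] = n
    counts = {j: 2 * m[j] - m[j - 1] - m[j + 1] for j in range(1, k + 1)}
    return {j: c for j, c in counts.items() if c > 0}


sample = [[0, 1, 1, 0, 1],
          [0, 0, 1, 1, 1],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0]]
blocks = jordan_block_counts(sample)
```

For this sample matrix the nullities are m_1 = 3, m_2 = 4, m_3 = 5, so the result is one block of order 3 and two blocks of order 1.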


10.17. Let

    [ 0 1 1 0 1 ]             [ 0 1 1 0 0 ]
    [ 0 0 1 1 1 ]             [ 0 0 1 1 1 ]
A = [ 0 0 0 0 0 ]   and   B = [ 0 0 0 1 1 ]
    [ 0 0 0 0 0 ]             [ 0 0 0 0 0 ]
    [ 0 0 0 0 0 ]             [ 0 0 0 0 0 ]

The reader can verify that A and B are both nilpotent of index 3; that is, A^3 = 0 but A^2 ≠ 0, and
B^3 = 0 but B^2 ≠ 0. Find the nilpotent matrices M_A and M_B in canonical form that are similar to
A and B, respectively.


Because A and B are nilpotent of index 3, M_A and M_B must each contain a Jordan nilpotent block of
order 3, and none greater than 3. Note that rank(A) = 2 and rank(B) = 3, so nullity(A) = 5 - 2 = 3 and
nullity(B) = 5 - 3 = 2. Thus, M_A must contain three diagonal blocks, which must be one of order 3 and
two of order 1; and M_B must contain two diagonal blocks, which must be one of order 3 and one of order 2.
Namely,

      [ 0 1 0 0 0 ]               [ 0 1 0 0 0 ]
      [ 0 0 1 0 0 ]               [ 0 0 1 0 0 ]
M_A = [ 0 0 0 0 0 ]   and   M_B = [ 0 0 0 0 0 ]
      [ 0 0 0 0 0 ]               [ 0 0 0 0 1 ]
      [ 0 0 0 0 0 ]               [ 0 0 0 0 0 ]

10.18. Prove Theorem 10.11 on the Jordan canonical form for an operator T.


By the primary decomposition theorem, T is decomposable into operators T_1, ..., T_r; that is,
T = T_1 ⊕ ... ⊕ T_r, where (t - λ_i)^{m_i} is the minimal polynomial of T_i. Thus, in particular,

(T_1 - λ_1 I)^{m_1} = 0, ..., (T_r - λ_r I)^{m_r} = 0

Set N_i = T_i - λ_i I. Then, for i = 1, ..., r,

T_i = N_i + λ_i I, where N_i^{m_i} = 0

That is, T_i is the sum of the scalar operator λ_i I and a nilpotent operator N_i, which is of index m_i because
(t - λ_i)^{m_i} is the minimal polynomial of T_i.

Now, by Theorem 10.10 on nilpotent operators, we can choose a basis so that N_i is in canonical form.
In this basis, T_i = N_i + λ_i I is represented by a block diagonal matrix M_i whose diagonal entries are the
matrices J_ij. The direct sum J of the matrices M_i is in Jordan canonical form and, by Theorem 10.5, is a
matrix representation of T.

Last, we must show that the blocks J_ij satisfy the required properties. Property (i) follows from the fact
that N_i is of index m_i. Property (ii) is true because T and J have the same characteristic polynomial. Property
(iii) is true because the nullity of N_i = T_i - λ_i I is equal to the geometric multiplicity of the eigenvalue λ_i.
Property (iv) follows from the fact that the T_i and hence the N_i are uniquely determined by T.


10.19. Determine all possible Jordan canonical forms J for a linear operator T:V → V whose
characteristic polynomial Δ(t) = (t - 2)^5 and whose minimal polynomial m(t) = (t - 2)^2.

J must be a 5 × 5 matrix, because Δ(t) has degree 5, and all diagonal elements must be 2, because 2 is
the only eigenvalue. Moreover, because the exponent of t - 2 in m(t) is 2, J must have one Jordan block of
order 2, and the others must be of order 2 or 1. Thus, there are only two possibilities:

J = diag([2 1; 0 2], [2 1; 0 2], [2]) or J = diag([2 1; 0 2], [2], [2], [2])


10.20. Determine all possible Jordan canonical forms for a linear operator T:V → V whose characteristic
polynomial Δ(t) = (t - 2)^3 (t - 5)^2. In each case, find the minimal polynomial m(t).

Because t - 2 has exponent 3 in Δ(t), 2 must appear three times on the diagonal. Similarly, 5 must
appear twice. Thus, there are six possibilities:

(a) diag([2 1 0; 0 2 1; 0 0 2], [5 1; 0 5]), (b) diag([2 1 0; 0 2 1; 0 0 2], [5], [5]),
(c) diag([2 1; 0 2], [2], [5 1; 0 5]), (d) diag([2 1; 0 2], [2], [5], [5]),
(e) diag([2], [2], [2], [5 1; 0 5]), (f) diag([2], [2], [2], [5], [5])

The exponent in the minimal polynomial m(t) is equal to the size of the largest block. Thus,

(a) m(t) = (t - 2)^3 (t - 5)^2, (b) m(t) = (t - 2)^3 (t - 5), (c) m(t) = (t - 2)^2 (t - 5)^2,
(d) m(t) = (t - 2)^2 (t - 5), (e) m(t) = (t - 2)(t - 5)^2, (f) m(t) = (t - 2)(t - 5)
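The enumeration in Problems 10.19 and 10.20 amounts to listing partitions: for each eigenvalue, the Jordan block orders form a partition of its algebraic multiplicity, and the largest part is the exponent in the minimal polynomial. The sketch below, in plain Python with function names that are illustrative assumptions, reproduces the six cases in the order (a) to (f).

```python
from itertools import product


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of block orders."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def jordan_forms(multiplicities):
    """Yield (block orders per eigenvalue, minimal-polynomial exponents) pairs.

    multiplicities maps each eigenvalue to its algebraic multiplicity.
    """
    eigenvalues = sorted(multiplicities)
    choices = [list(partitions(multiplicities[e])) for e in eigenvalues]
    for combo in product(*choices):
        blocks = dict(zip(eigenvalues, combo))
        # The largest block for an eigenvalue gives its exponent in m(t).
        exponents = {e: sizes[0] for e, sizes in blocks.items()}
        yield blocks, exponents


forms = list(jordan_forms({2: 3, 5: 2}))                  # Problem 10.20
forms_10_19 = [p for p in partitions(5) if p[0] == 2]     # Problem 10.19
```

Here `forms` has six entries, and `forms_10_19` lists the block orders (2, 2, 1) and (2, 1, 1, 1) allowed when the minimal polynomial is (t - 2)^2.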


Quotient Space and Triangular Form 
10.21. Let W be a subspace of a vector space V. Show that the following are equivalent:
(i) u ∈ v + W, (ii) u - v ∈ W, (iii) v ∈ u + W.

Suppose u ∈ v + W. Then there exists w_0 ∈ W such that u = v + w_0. Hence, u - v = w_0 ∈ W.
Conversely, suppose u - v ∈ W. Then u - v = w_0 where w_0 ∈ W. Hence, u = v + w_0 ∈ v + W. Thus,
(i) and (ii) are equivalent.

We also have u - v ∈ W iff -(u - v) = v - u ∈ W iff v ∈ u + W. Thus, (ii) and (iii) are also
equivalent.

10.22. Prove the following: The cosets of W in V partition V into mutually disjoint sets. That is,

(a) Any two cosets u + W and v + W are either identical or disjoint.
(b) Each v ∈ V belongs to a coset; in fact, v ∈ v + W.

Furthermore, u + W = v + W if and only if u - v ∈ W, and so (v + w) + W = v + W for any
w ∈ W.

Let v ∈ V. Because 0 ∈ W, we have v = v + 0 ∈ v + W, which proves (b).

Now suppose the cosets u + W and v + W are not disjoint; say, the vector x belongs to both u + W
and v + W. Then u - x ∈ W and x - v ∈ W. The proof of (a) is complete if we show that u + W = v + W.
Let u + w_0 be any element in the coset u + W. Because u - x, x - v, w_0 belong to W,

(u + w_0) - v = (u - x) + (x - v) + w_0 ∈ W

Thus, u + w_0 ∈ v + W, and hence the coset u + W is contained in the coset v + W. Similarly, v + W is
contained in u + W, and so u + W = v + W.

The last statement follows from the fact that u + W = v + W if and only if u ∈ v + W, and, by
Problem 10.21, this is equivalent to u - v ∈ W.


10.23. Let W be the solution space of the homogeneous equation 2x + 3y + 4z = 0. Describe the cosets
of W in R^3.

W is a plane through the origin O = (0, 0, 0), and the cosets of W are the planes parallel to W.
Equivalently, the cosets of W are the solution sets of the family of equations

2x + 3y + 4z = k, k ∈ R

In fact, the coset v + W, where v = (a, b, c), is the solution set of the linear equation

2x + 3y + 4z = 2a + 3b + 4c or 2(x - a) + 3(y - b) + 4(z - c) = 0


10.24. Suppose W is a subspace of a vector space V. Show that the operations in Theorem 10.15 are well
defined; namely, show that if u + W = u' + W and v + W = v' + W, then

(a) (u + v) + W = (u' + v') + W and (b) ku + W = ku' + W for any k ∈ K

(a) Because u + W = u' + W and v + W = v' + W, both u - u' and v - v' belong to W. But then
(u + v) - (u' + v') = (u - u') + (v - v') ∈ W. Hence, (u + v) + W = (u' + v') + W.

(b) Also, because u - u' ∈ W implies k(u - u') ∈ W, then ku - ku' = k(u - u') ∈ W; accordingly,
ku + W = ku' + W.


10.25. Let V be a vector space and W a subspace of V. Show that the natural map η: V → V/W, defined
by η(v) = v + W, is linear.

For any u, v ∈ V and any k ∈ K, we have

η(u + v) = u + v + W = u + W + v + W = η(u) + η(v)

and η(kv) = kv + W = k(v + W) = kη(v)

Accordingly, η is linear.


10.26. Let W be a subspace of a vector space V. Suppose {w_1, ..., w_r} is a basis of W and the set of
cosets {v̄_1, ..., v̄_s}, where v̄_j = v_j + W, is a basis of the quotient space. Show that the set of
vectors B = {v_1, ..., v_s, w_1, ..., w_r} is a basis of V. Thus, dim V = dim W + dim(V/W).

Suppose u ∈ V. Because {v̄_j} is a basis of V/W,

ū = u + W = a_1 v̄_1 + a_2 v̄_2 + ... + a_s v̄_s

Hence, u = a_1 v_1 + ... + a_s v_s + w, where w ∈ W. Since {w_i} is a basis of W,

u = a_1 v_1 + ... + a_s v_s + b_1 w_1 + ... + b_r w_r

Accordingly, B spans V.
We now show that B is linearly independent. Suppose

c_1 v_1 + ... + c_s v_s + d_1 w_1 + ... + d_r w_r = 0 (1)

Then c_1 v̄_1 + ... + c_s v̄_s = 0̄ = W

Because {v̄_j} is independent, the c's are all 0. Substituting into (1), we find d_1 w_1 + ... + d_r w_r = 0.
Because {w_i} is independent, the d's are all 0. Thus, B is linearly independent and therefore a basis of V.


10.27. Prove Theorem 10.16: Suppose W is a subspace invariant under a linear operator T:V → V. Then
T induces a linear operator T̄ on V/W defined by T̄(v + W) = T(v) + W. Moreover, if T is a
zero of any polynomial, then so is T̄. Thus, the minimal polynomial of T̄ divides the minimal
polynomial of T.

We first show that T̄ is well defined; that is, if u + W = v + W, then T̄(u + W) = T̄(v + W). If
u + W = v + W, then u - v ∈ W, and, as W is T-invariant, T(u - v) = T(u) - T(v) ∈ W. Accordingly,

T̄(u + W) = T(u) + W = T(v) + W = T̄(v + W)

as required.
We next show that T̄ is linear. We have

T̄((u + W) + (v + W)) = T̄(u + v + W) = T(u + v) + W = T(u) + T(v) + W
= T(u) + W + T(v) + W = T̄(u + W) + T̄(v + W)

Furthermore,

T̄(k(u + W)) = T̄(ku + W) = T(ku) + W = kT(u) + W = k(T(u) + W) = kT̄(u + W)

Thus, T̄ is linear.
Now, for any coset u + W in V/W, the operator \overline{T^2} induced by T^2 satisfies

\overline{T^2}(u + W) = T^2(u) + W = T(T(u)) + W = T̄(T(u) + W) = T̄(T̄(u + W)) = T̄^2(u + W)

Hence, \overline{T^2} = T̄^2. Similarly, \overline{T^n} = T̄^n for any n. Thus, for any polynomial

f(t) = a_n t^n + ... + a_0 = Σ a_i t^i

\overline{f(T)}(u + W) = f(T)(u) + W = Σ a_i T^i(u) + W = Σ a_i (T^i(u) + W)
= Σ a_i \overline{T^i}(u + W) = Σ a_i T̄^i(u + W) = (Σ a_i T̄^i)(u + W) = f(T̄)(u + W)

and so \overline{f(T)} = f(T̄). Accordingly, if T is a root of f(t) then \overline{f(T)} = 0̄ = W = f(T̄); that is, T̄ is also a root
of f(t). The theorem is proved.


10.28. Prove Theorem 10.1: Let T:V → V be a linear operator whose characteristic polynomial factors
into linear polynomials. Then V has a basis in which T is represented by a triangular matrix.

The proof is by induction on the dimension of V. If dim V = 1, then every matrix representation of T
is a 1 × 1 matrix, which is triangular.

Now suppose dim V = n > 1 and that the theorem holds for spaces of dimension less than n. Because
the characteristic polynomial of T factors into linear polynomials, T has at least one eigenvalue and so at
least one nonzero eigenvector v, say T(v) = a_11 v. Let W be the one-dimensional subspace spanned by v.
Set V̄ = V/W. Then (Problem 10.26) dim V̄ = dim V - dim W = n - 1. Note also that W is invariant
under T. By Theorem 10.16, T induces a linear operator T̄ on V̄ whose minimal polynomial divides the
minimal polynomial of T. Because the characteristic polynomial of T is a product of linear polynomials,
so is its minimal polynomial, and hence, so are the minimal and characteristic polynomials of T̄. Thus, V̄
and T̄ satisfy the hypothesis of the theorem. Hence, by induction, there exists a basis {v̄_2, ..., v̄_n} of V̄
such that

T̄(v̄_2) = a_22 v̄_2
T̄(v̄_3) = a_32 v̄_2 + a_33 v̄_3
.....
T̄(v̄_n) = a_n2 v̄_2 + a_n3 v̄_3 + ... + a_nn v̄_n

Now let v_2, ..., v_n be elements of V that belong to the cosets v̄_2, ..., v̄_n, respectively. Then {v, v_2, ..., v_n}
is a basis of V (Problem 10.26). Because T̄(v̄_2) = a_22 v̄_2, we have

T̄(v̄_2) - a_22 v̄_2 = 0̄, and so T(v_2) - a_22 v_2 ∈ W

But W is spanned by v; hence, T(v_2) - a_22 v_2 is a multiple of v, say,

T(v_2) - a_22 v_2 = a_21 v, and so T(v_2) = a_21 v + a_22 v_2

Similarly, for i = 3, ..., n

T(v_i) - a_i2 v_2 - a_i3 v_3 - ... - a_ii v_i ∈ W, and so T(v_i) = a_i1 v + a_i2 v_2 + ... + a_ii v_i

Thus,

T(v) = a_11 v
T(v_2) = a_21 v + a_22 v_2
.....
T(v_n) = a_n1 v + a_n2 v_2 + ... + a_nn v_n

and hence the matrix of T in this basis is triangular.


Cyclic Subspaces, Rational Canonical Form 


10.29. Prove Theorem 10.12: Let Z(v, T) be a T-cyclic subspace, T_v the restriction of T to Z(v, T), and
m_v(t) = t^k + a_{k-1} t^{k-1} + ... + a_0 the T-annihilator of v. Then,

(i) The set {v, T(v), ..., T^{k-1}(v)} is a basis of Z(v, T); hence, dim Z(v, T) = k.
(ii) The minimal polynomial of T_v is m_v(t).
(iii) The matrix of T_v in the above basis is the companion matrix C = C(m_v) of m_v(t) [which
has 1's below the diagonal, the negative of the coefficients a_0, a_1, ..., a_{k-1} of m_v(t) in the
last column, and 0's elsewhere].


(i) By definition of m_v(t), T^k(v) is the first vector in the sequence v, T(v), T^2(v), ... that is a linear
combination of those vectors that precede it in the sequence; hence, the set B = {v, T(v), ..., T^{k-1}(v)} is
linearly independent. We now only have to show that Z(v, T) = L(B), the linear span of B. By the above,
T^k(v) ∈ L(B). We prove by induction that T^n(v) ∈ L(B) for every n. Suppose n > k and
T^{n-1}(v) ∈ L(B), that is, T^{n-1}(v) is a linear combination of v, ..., T^{k-1}(v). Then
T^n(v) = T(T^{n-1}(v)) is a linear combination of T(v), ..., T^k(v). But T^k(v) ∈ L(B); hence,
T^n(v) ∈ L(B) for every n. Consequently, f(T)(v) ∈ L(B) for any polynomial f(t). Thus,
Z(v, T) = L(B), and so B is a basis, as claimed.

(ii) Suppose m(t) = t^s + b_{s-1} t^{s-1} + ... + b_0 is the minimal polynomial of T_v. Then, because v ∈ Z(v, T),

0 = m(T_v)(v) = m(T)(v) = T^s(v) + b_{s-1} T^{s-1}(v) + ... + b_0 v

Thus, T^s(v) is a linear combination of v, T(v), ..., T^{s-1}(v), and therefore k ≤ s. However,
m_v(T) vanishes on Z(v, T), because m_v(T)(T^i(v)) = T^i(m_v(T)(v)) = 0 for every i, and so m_v(T_v) = 0.
Then m(t) divides m_v(t), and so s ≤ k. Accordingly, k = s and hence m_v(t) = m(t).


5 7,(») = (2) 
P(T) = 1%(v) 
gg Teee = 
Li = Tu) =—ayu—ajT(v) — ayF*(v) — + — ay PH"(0) 


By definition, the matrix of T, in this basis is the tranpose of the matrix of coefficients of the above 
system of equations; hence, it is C, as required. 
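
As an illustration of part (iii), the following is a minimal sketch, assuming the polynomial is passed as a coefficient list `[a_0, ..., a_{k-1}]` with the leading coefficient 1 implied. The helper names `companion`, `mat_mul`, and `poly_at_matrix` are hypothetical and chosen only for this example. The sketch builds $C(m)$ and checks numerically that $m(C) = 0$ for $m(t) = (t-3)^3$.

```python
from fractions import Fraction

def companion(coeffs):
    """Companion matrix of t^k + a_{k-1} t^{k-1} + ... + a_0.

    coeffs = [a_0, a_1, ..., a_{k-1}]; the leading coefficient 1 is implied.
    """
    k = len(coeffs)
    C = [[Fraction(0)] * k for _ in range(k)]
    for i in range(1, k):
        C[i][i - 1] = Fraction(1)           # 1's just below the diagonal
    for i in range(k):
        C[i][k - 1] = -Fraction(coeffs[i])  # negated coefficients in the last column
    return C

def mat_mul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][r] * B[r][j] for r in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def poly_at_matrix(coeffs, C):
    """Evaluate m(C) = C^k + a_{k-1} C^{k-1} + ... + a_0 I by Horner's rule."""
    k = len(C)
    identity = [[Fraction(int(i == j)) for j in range(k)] for i in range(k)]
    result = [[Fraction(0)] * k for _ in range(k)]
    for a in reversed(list(coeffs) + [1]):  # start from the leading coefficient
        result = mat_mul(result, C)
        result = [[result[i][j] + a * identity[i][j] for j in range(k)]
                  for i in range(k)]
    return result

# m(t) = (t - 3)^3 = t^3 - 9t^2 + 27t - 27
coeffs = [-27, 27, -9]
C = companion(coeffs)
zero = poly_at_matrix(coeffs, C)
```

Exact rational arithmetic is used so that the check $m(C) = 0$ is not blurred by rounding.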


10.30. Let $T: V \to V$ be linear. Let $W$ be a $T$-invariant subspace of $V$ and $\bar{T}$ the induced operator on $V/W$. Prove

(a) The $T$-annihilator of $v \in V$ divides the minimal polynomial of $T$.
(b) The $\bar{T}$-annihilator of $\bar{v} \in V/W$ divides the minimal polynomial of $T$.

(a) The $T$-annihilator of $v \in V$ is the minimal polynomial of the restriction of $T$ to $Z(v, T)$; therefore, by Problem 10.6, it divides the minimal polynomial of $T$.

(b) The $\bar{T}$-annihilator of $\bar{v} \in V/W$ divides the minimal polynomial of $\bar{T}$, which divides the minimal polynomial of $T$ by Theorem 10.16.

Remark: In the case where the minimal polynomial of $T$ is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial, the $T$-annihilator of $v \in V$ and the $\bar{T}$-annihilator of $\bar{v} \in V/W$ are of the form $f(t)^m$, where $m \leq n$.


10.31. Prove Lemma 10.13: Let $T: V \to V$ be a linear operator whose minimal polynomial is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial. Then $V$ is the direct sum of $T$-cyclic subspaces $Z_i = Z(v_i, T)$, $i = 1, \ldots, r$, with corresponding $T$-annihilators
$$f(t)^{n_1},\ f(t)^{n_2},\ \ldots,\ f(t)^{n_r}, \qquad n = n_1 \geq n_2 \geq \cdots \geq n_r$$


Any other decomposition of V into the direct sum of T-cyclic subspaces has the same number of 
components and the same set of T-annihilators. 


The proof is by induction on the dimension of V. If dim V = 1, then V is T-cyclic and the lemma 
holds. Now suppose dim V > 1 and that the lemma holds for those vector spaces of dimension less than 
that of V. 

Because the minimal polynomial of $T$ is $f(t)^n$, there exists $v_1 \in V$ such that $f(T)^{n-1}(v_1) \neq 0$; hence, the $T$-annihilator of $v_1$ is $f(t)^n$. Let $Z_1 = Z(v_1, T)$ and recall that $Z_1$ is $T$-invariant. Let $\bar{V} = V/Z_1$ and let $\bar{T}$ be the linear operator on $\bar{V}$ induced by $T$. By Theorem 10.16, the minimal polynomial of $\bar{T}$ divides $f(t)^n$; hence, the hypothesis holds for $\bar{V}$ and $\bar{T}$. Consequently, by induction, $\bar{V}$ is the direct sum of $\bar{T}$-cyclic subspaces; say,
$$\bar{V} = Z(\bar{v}_2, \bar{T}) \oplus \cdots \oplus Z(\bar{v}_r, \bar{T})$$
where the corresponding $\bar{T}$-annihilators are $f(t)^{n_2}, \ldots, f(t)^{n_r}$, $n \geq n_2 \geq \cdots \geq n_r$.

We claim that there is a vector $v_2$ in the coset $\bar{v}_2$ whose $T$-annihilator is $f(t)^{n_2}$, the $\bar{T}$-annihilator of $\bar{v}_2$.

Let $w$ be any vector in $\bar{v}_2$. Then $f(T)^{n_2}(w) \in Z_1$. Hence, there exists a polynomial $g(t)$ for which
$$f(T)^{n_2}(w) = g(T)(v_1) \qquad (1)$$
Because $f(t)^n$ is the minimal polynomial of $T$, we have, by (1),
$$0 = f(T)^n(w) = f(T)^{n-n_2}\, g(T)(v_1)$$
But $f(t)^n$ is the $T$-annihilator of $v_1$; hence, $f(t)^n$ divides $f(t)^{n-n_2} g(t)$, and so $g(t) = f(t)^{n_2} h(t)$ for some polynomial $h(t)$. We set
$$v_2 = w - h(T)(v_1)$$
Because $w - v_2 = h(T)(v_1) \in Z_1$, $v_2$ also belongs to the coset $\bar{v}_2$. Thus, the $T$-annihilator of $v_2$ is a multiple of the $\bar{T}$-annihilator of $\bar{v}_2$. On the other hand, by (1),
$$f(T)^{n_2}(v_2) = f(T)^{n_2}(w - h(T)(v_1)) = f(T)^{n_2}(w) - g(T)(v_1) = 0$$
Consequently, the $T$-annihilator of $v_2$ is $f(t)^{n_2}$, as claimed.

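Similarly, there exist vectors $v_3, \ldots, v_r \in V$ such that $v_i \in \bar{v}_i$ and that the $T$-annihilator of $v_i$ is $f(t)^{n_i}$, the $\bar{T}$-annihilator of $\bar{v}_i$. We set
$$Z_2 = Z(v_2, T),\ \ldots,\ Z_r = Z(v_r, T)$$
Let $d$ denote the degree of $f(t)$, so that $f(t)^{n_i}$ has degree $dn_i$. Then, because $f(t)^{n_i}$ is both the $T$-annihilator of $v_i$ and the $\bar{T}$-annihilator of $\bar{v}_i$, we know that
$$\{v_i, T(v_i), \ldots, T^{dn_i-1}(v_i)\} \quad\text{and}\quad \{\bar{v}_i, \bar{T}(\bar{v}_i), \ldots, \bar{T}^{dn_i-1}(\bar{v}_i)\}$$
are bases for $Z(v_i, T)$ and $Z(\bar{v}_i, \bar{T})$, respectively, for $i = 2, \ldots, r$. But $\bar{V} = Z(\bar{v}_2, \bar{T}) \oplus \cdots \oplus Z(\bar{v}_r, \bar{T})$; hence,
$$\{\bar{v}_2, \ldots, \bar{T}^{dn_2-1}(\bar{v}_2),\ \ldots,\ \bar{v}_r, \ldots, \bar{T}^{dn_r-1}(\bar{v}_r)\}$$
is a basis for $\bar{V}$. Therefore, by Problem 10.26 and the relation $\bar{T}^i(\bar{v}) = \overline{T^i(v)}$ (see Problem 10.27),
$$\{v_1, \ldots, T^{dn_1-1}(v_1),\ v_2, \ldots, T^{dn_2-1}(v_2),\ \ldots,\ v_r, \ldots, T^{dn_r-1}(v_r)\}$$
is a basis for $V$. Thus, by Theorem 10.4, $V = Z(v_1, T) \oplus \cdots \oplus Z(v_r, T)$, as required.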


It remains to show that the exponents $n_1, \ldots, n_r$ are uniquely determined by $T$. Because $d$ = degree of $f(t)$,
$$\dim V = d(n_1 + \cdots + n_r) \quad\text{and}\quad \dim Z_i = dn_i, \quad i = 1, \ldots, r$$
Also, if $s$ is any positive integer, then (Problem 10.59) $f(T)^s(Z_i)$ is a cyclic subspace generated by $f(T)^s(v_i)$, and it has dimension $d(n_i - s)$ if $n_i > s$ and dimension 0 if $n_i \leq s$.

Now any vector $v \in V$ can be written uniquely in the form $v = w_1 + \cdots + w_r$, where $w_i \in Z_i$. Hence, any vector in $f(T)^s(V)$ can be written uniquely in the form
$$f(T)^s(v) = f(T)^s(w_1) + \cdots + f(T)^s(w_r)$$
where $f(T)^s(w_i) \in f(T)^s(Z_i)$. Let $t$ be the integer, dependent on $s$, for which
$$n_1 > s,\ \ldots,\ n_t > s, \quad n_{t+1} \leq s$$
Then
$$f(T)^s(V) = f(T)^s(Z_1) \oplus \cdots \oplus f(T)^s(Z_t)$$
and so
$$\dim[f(T)^s(V)] = d[(n_1 - s) + \cdots + (n_t - s)] \qquad (2)$$
The numbers on the left of (2) are uniquely determined by $T$. Set $s = n - 1$, and (2) determines the number of $n_i$ equal to $n$. Next set $s = n - 2$, and (2) determines the number of $n_i$ (if any) equal to $n - 1$. We repeat the process until we set $s = 0$ and determine the number of $n_i$ equal to 1. Thus, the $n_i$ are uniquely determined by $T$ and $V$, and the lemma is proved.


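10.32. Let $V$ be a seven-dimensional vector space over $\mathbf{R}$, and let $T: V \to V$ be a linear operator with minimal polynomial $m(t) = (t^2 - 2t + 5)(t - 3)^3$. Find all possible rational canonical forms $M$ of $T$.

Because $\dim V = 7$, there are only two possible characteristic polynomials, $\Delta_1(t) = (t^2 - 2t + 5)^2(t - 3)^3$ or $\Delta_2(t) = (t^2 - 2t + 5)(t - 3)^5$. Moreover, the orders of the companion matrices must add up to 7. Also, one companion matrix must be $C(t^2 - 2t + 5)$ and one must be $C((t - 3)^3) = C(t^3 - 9t^2 + 27t - 27)$. Thus, $M$ must be one of the following block diagonal matrices:

(a) $\text{diag}\left( \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & 0 & 27 \\ 1 & 0 & -27 \\ 0 & 1 & 9 \end{bmatrix} \right)$,

(b) $\text{diag}\left( \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & 0 & 27 \\ 1 & 0 & -27 \\ 0 & 1 & 9 \end{bmatrix},\ \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix} \right)$,

(c) $\text{diag}\left( \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & 0 & 27 \\ 1 & 0 & -27 \\ 0 & 1 & 9 \end{bmatrix},\ [3],\ [3] \right)$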
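
The count of three forms can be cross-checked by enumeration. The sketch below is illustrative only: it assumes each irreducible factor of $m(t)$ is described by a pair (degree, exponent in $m(t)$), and the function names are hypothetical. For each factor it lists the exponents of the elementary divisors as a non-increasing list whose largest entry is the exponent in $m(t)$, subject to the dimensions adding up to $\dim V$.

```python
def partitions_with_max(total, largest):
    """Yield non-increasing lists of positive integers that sum to `total`
    and whose first (largest) entry is exactly `largest`."""
    def rest(remaining, cap):
        if remaining == 0:
            yield []
            return
        for part in range(min(remaining, cap), 0, -1):
            for tail in rest(remaining - part, part):
                yield [part] + tail
    if total >= largest:
        for tail in rest(total - largest, largest):
            yield [largest] + tail

def elementary_divisor_patterns(factors, dim):
    """factors: list of (degree, exponent_in_minimal_polynomial) pairs.

    Returns one list of exponent lists per admissible rational canonical form.
    """
    def go(idx, remaining):
        if idx == len(factors):
            if remaining == 0:
                yield []
            return
        degree, top = factors[idx]
        # A total exponent e for this factor uses degree * e dimensions.
        for e in range(top, remaining // degree + 1):
            for parts in partitions_with_max(e, top):
                for tail in go(idx + 1, remaining - degree * e):
                    yield [parts] + tail
    return list(go(0, dim))

# m(t) = (t^2 - 2t + 5)(t - 3)^3 on a 7-dimensional space
patterns = elementary_divisor_patterns([(2, 1), (1, 3)], 7)
for p in patterns:
    print(p)
```

Each printed pattern corresponds to one block diagonal matrix; for example, `[[1], [3, 2]]` stands for $C(t^2 - 2t + 5)$, $C((t-3)^3)$, $C((t-3)^2)$.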
E 0 1 9 


Projections 
10.33. Suppose $V = W_1 \oplus \cdots \oplus W_r$. The projection of $V$ into its subspace $W_k$ is the mapping $E: V \to V$ defined by $E(v) = w_k$, where $v = w_1 + \cdots + w_r$, $w_i \in W_i$. Show that (a) $E$ is linear, (b) $E^2 = E$.

(a) Because the sum $v = w_1 + \cdots + w_r$, $w_i \in W_i$, is uniquely determined by $v$, the mapping $E$ is well defined. Suppose, for $u \in V$, $u = w_1' + \cdots + w_r'$, $w_i' \in W_i$. Then
$$v + u = (w_1 + w_1') + \cdots + (w_r + w_r') \quad\text{and}\quad kv = kw_1 + \cdots + kw_r, \qquad kw_i,\ w_i + w_i' \in W_i$$
are the unique sums corresponding to $v + u$ and $kv$. Hence,
$$E(v + u) = w_k + w_k' = E(v) + E(u) \quad\text{and}\quad E(kv) = kw_k = kE(v)$$
and therefore $E$ is linear.

(b) We have that
$$w_k = 0 + \cdots + 0 + w_k + 0 + \cdots + 0$$
is the unique sum corresponding to $w_k \in W_k$; hence, $E(w_k) = w_k$. Then, for any $v \in V$,
$$E^2(v) = E(E(v)) = E(w_k) = w_k = E(v)$$
Thus, $E^2 = E$, as required.

10.34. Suppose $E: V \to V$ is linear and $E^2 = E$. Show that (a) $E(u) = u$ for any $u \in \text{Im}\, E$ (i.e., the restriction of $E$ to its image is the identity mapping); (b) $V$ is the direct sum of the image and kernel of $E$: $V = \text{Im}\, E \oplus \text{Ker}\, E$; (c) $E$ is the projection of $V$ into $\text{Im}\, E$, its image. Thus, by the preceding problem, a linear mapping $T: V \to V$ is a projection if and only if $T^2 = T$; this characterization of a projection is frequently used as its definition.

(a) If $u \in \text{Im}\, E$, then there exists $v \in V$ for which $E(v) = u$; hence, as required,
$$E(u) = E(E(v)) = E^2(v) = E(v) = u$$

(b) Let $v \in V$. We can write $v$ in the form $v = E(v) + v - E(v)$. Now $E(v) \in \text{Im}\, E$ and, because
$$E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0$$
$v - E(v) \in \text{Ker}\, E$. Accordingly, $V = \text{Im}\, E + \text{Ker}\, E$.

Now suppose $w \in \text{Im}\, E \cap \text{Ker}\, E$. By (a), $E(w) = w$ because $w \in \text{Im}\, E$. On the other hand, $E(w) = 0$ because $w \in \text{Ker}\, E$. Thus, $w = 0$, and so $\text{Im}\, E \cap \text{Ker}\, E = \{0\}$. These two conditions imply that $V$ is the direct sum of the image and kernel of $E$.

(c) Let $v \in V$ and suppose $v = u + w$, where $u \in \text{Im}\, E$ and $w \in \text{Ker}\, E$. Note that $E(u) = u$ by (a), and $E(w) = 0$ because $w \in \text{Ker}\, E$. Hence,
$$E(v) = E(u + w) = E(u) + E(w) = u + 0 = u$$
That is, $E$ is the projection of $V$ into its image.

10.35. Suppose $V = U \oplus W$ and suppose $T: V \to V$ is linear. Show that $U$ and $W$ are both $T$-invariant if and only if $TE = ET$, where $E$ is the projection of $V$ into $U$.


Observe that $E(v) \in U$ for every $v \in V$, and that (i) $E(v) = v$ iff $v \in U$, (ii) $E(v) = 0$ iff $v \in W$.
Suppose $ET = TE$. Let $u \in U$. Because $E(u) = u$,
$$T(u) = T(E(u)) = (TE)(u) = (ET)(u) = E(T(u)) \in U$$
Hence, $U$ is $T$-invariant. Now let $w \in W$. Because $E(w) = 0$,
$$E(T(w)) = (ET)(w) = (TE)(w) = T(E(w)) = T(0) = 0, \quad\text{and so}\quad T(w) \in W$$
Hence, $W$ is also $T$-invariant.
Conversely, suppose $U$ and $W$ are both $T$-invariant. Let $v \in V$ and suppose $v = u + w$, where $u \in U$ and $w \in W$. Then $T(u) \in U$ and $T(w) \in W$; hence, $E(T(u)) = T(u)$ and $E(T(w)) = 0$. Thus,
$$(ET)(v) = (ET)(u + w) = (ET)(u) + (ET)(w) = E(T(u)) + E(T(w)) = T(u)$$
and
$$(TE)(v) = (TE)(u + w) = T(E(u + w)) = T(u)$$
That is, $(ET)(v) = (TE)(v)$ for every $v \in V$; therefore, $ET = TE$, as required.
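
The projection results above can be spot-checked on a small case. The following sketch assumes $V = \mathbf{R}^2$ with $U = \text{span}\{(1, 0)\}$ and $W = \text{span}\{(1, -1)\}$; these subspaces and the matrices `T_inv` and `T_not` are illustrative choices, not taken from the text.

```python
def mat_mul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][r] * B[r][j] for r in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_vec(A, v):
    """Apply the matrix A to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Projection of R^2 into U = span{(1, 0)} along W = span{(1, -1)}:
# (x, y) = (x + y)(1, 0) + (-y)(1, -1), so E(x, y) = (x + y, 0).
E = [[1, 1],
     [0, 0]]

# T_inv maps (1, 0) to 2(1, 0) and (1, -1) to 3(1, -1): U and W are invariant.
T_inv = [[2, -1],
         [0, 3]]

# T_not keeps U invariant but sends (1, -1) to (2, -3), which is not in W.
T_not = [[2, 0],
         [0, 3]]

v = [3, 4]
image_part = mat_vec(E, v)                             # component in Im E = U
kernel_part = [x - y for x, y in zip(v, image_part)]   # component in Ker E = W

print(mat_mul(E, E) == E)                              # E^2 = E
print(mat_mul(E, T_inv) == mat_mul(T_inv, E))          # commute: both invariant
print(mat_mul(E, T_not) == mat_mul(T_not, E))          # fail: W not invariant
```

A single example of course proves nothing in general; it only shows the statements of Problems 10.33 to 10.35 in action.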


SUPPLEMENTARY PROBLEMS 


Invariant Subspaces 


10.36. Suppose $W$ is invariant under $T: V \to V$. Show that $W$ is invariant under $f(T)$ for any polynomial $f(t)$.

10.37. Show that every subspace of $V$ is invariant under $I$ and $0$, the identity and zero operators.

10.38. Let $W$ be invariant under $T_1: V \to V$ and $T_2: V \to V$. Prove $W$ is also invariant under $T_1 + T_2$ and $T_1 T_2$.

10.39. Let $T: V \to V$ be linear. Prove that any eigenspace, $E_\lambda$, is $T$-invariant.

10.40. Let $V$ be a vector space of odd dimension (greater than 1) over the real field $\mathbf{R}$. Show that any linear operator on $V$ has an invariant subspace other than $V$ or $\{0\}$.

10.41. Determine the invariant subspaces of $A = \begin{bmatrix} 2 & -4 \\ 5 & -2 \end{bmatrix}$ viewed as a linear operator on (a) $\mathbf{R}^2$, (b) $\mathbf{C}^2$.

10.42. Suppose $\dim V = n$. Show that $T: V \to V$ has a triangular matrix representation if and only if there exist $T$-invariant subspaces $W_1 \subset W_2 \subset \cdots \subset W_n = V$ for which $\dim W_k = k$, $k = 1, \ldots, n$.


Invariant Direct Sums 


10.43. The subspaces $W_1, \ldots, W_r$ are said to be independent if $w_1 + \cdots + w_r = 0$, $w_i \in W_i$, implies that each $w_i = 0$. Show that $\text{span}(W_i) = W_1 \oplus \cdots \oplus W_r$ if and only if the $W_i$ are independent. [Here $\text{span}(W_i)$ denotes the linear span of the $W_i$.]

10.44. Show that $V = W_1 \oplus \cdots \oplus W_r$ if and only if (i) $V = \text{span}(W_i)$ and (ii) for $k = 1, 2, \ldots, r$, $W_k \cap \text{span}(W_1, \ldots, W_{k-1}, W_{k+1}, \ldots, W_r) = \{0\}$.

10.45. Show that $\text{span}(W_i) = W_1 \oplus \cdots \oplus W_r$ if and only if $\dim[\text{span}(W_i)] = \dim W_1 + \cdots + \dim W_r$.

10.46. Suppose the characteristic polynomial of $T: V \to V$ is $\Delta(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r}$, where the $f_i(t)$ are distinct monic irreducible polynomials. Let $V = W_1 \oplus \cdots \oplus W_r$ be the primary decomposition of $V$ into $T$-invariant subspaces. Show that $f_i(t)^{n_i}$ is the characteristic polynomial of the restriction of $T$ to $W_i$.


Nilpotent Operators 


10.47. Suppose $T_1$ and $T_2$ are nilpotent operators that commute (i.e., $T_1 T_2 = T_2 T_1$). Show that $T_1 + T_2$ and $T_1 T_2$ are also nilpotent.

10.48. Suppose $A$ is a supertriangular matrix (i.e., all entries on and below the main diagonal are 0). Show that $A$ is nilpotent.

10.49. Let $V$ be the vector space of polynomials of degree $\leq n$. Show that the derivative operator on $V$ is nilpotent of index $n + 1$.

10.50. Show that any Jordan nilpotent block matrix $N$ is similar to its transpose $N^T$ (the matrix with 1's below the diagonal and 0's elsewhere).

10.51. Show that two nilpotent matrices of order 3 are similar if and only if they have the same index of nilpotency. Show by example that the statement is not true for nilpotent matrices of order 4.


Jordan Canonical Form 


10.52. Find all possible Jordan canonical forms for those matrices whose characteristic polynomial $\Delta(t)$ and minimal polynomial $m(t)$ are as follows:

(a) $\Delta(t) = (t - 2)^4 (t - 3)^2$, $m(t) = (t - 2)^2 (t - 3)^2$,
(b) $\Delta(t) = (t - 7)^5$, $m(t) = (t - 7)^2$,
(c) $\Delta(t) = (t - 2)^7$, $m(t) = (t - 2)^3$

10.53. Show that every complex matrix is similar to its transpose. (Hint: Use its Jordan canonical form.)

10.54. Show that all $n \times n$ complex matrices $A$ for which $A^n = I$ but $A^k \neq I$ for $k < n$ are similar.

10.55. Suppose $A$ is a complex matrix with only real eigenvalues. Show that $A$ is similar to a matrix with only real entries.

Cyclic Subspaces


10.56. Suppose $T: V \to V$ is linear. Prove that $Z(v, T)$ is the intersection of all $T$-invariant subspaces containing $v$.

10.57. Let $f(t)$ and $g(t)$ be the $T$-annihilators of $u$ and $v$, respectively. Show that if $f(t)$ and $g(t)$ are relatively prime, then $f(t)g(t)$ is the $T$-annihilator of $u + v$.

10.58. Prove that $Z(u, T) = Z(v, T)$ if and only if $g(T)(u) = v$ where $g(t)$ is relatively prime to the $T$-annihilator of $u$.

10.59. Let $W = Z(v, T)$, and suppose the $T$-annihilator of $v$ is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial of degree $d$. Show that $f(T)^s(W)$ is a cyclic subspace generated by $f(T)^s(v)$ and that it has dimension $d(n - s)$ if $n > s$ and dimension 0 if $n \leq s$.

Rational Canonical Form

10.60. Find all possible rational forms for a $6 \times 6$ matrix over $\mathbf{R}$ with minimal polynomial:
(a) $m(t) = (t^2 - 2t + 3)(t + 1)^2$, (b) $m(t) = (t - 2)^3$.

10.61. Let $A$ be a $4 \times 4$ matrix with minimal polynomial $m(t) = (t^2 + 1)(t^2 - 3)$. Find the rational canonical form for $A$ if $A$ is a matrix over (a) the rational field $\mathbf{Q}$, (b) the real field $\mathbf{R}$, (c) the complex field $\mathbf{C}$.

10.62. Find the rational canonical form for the four-square Jordan block with $\lambda$'s on the diagonal.

10.63. Prove that the characteristic polynomial of an operator $T: V \to V$ is a product of its elementary divisors.

10.64. Prove that two $3 \times 3$ matrices with the same minimal and characteristic polynomials are similar.

10.65. Let $C(f(t))$ denote the companion matrix to an arbitrary polynomial $f(t)$. Show that $f(t)$ is the characteristic polynomial of $C(f(t))$.
Projections

10.66. Suppose $V = W_1 \oplus \cdots \oplus W_r$. Let $E_i$ denote the projection of $V$ into $W_i$. Prove (i) $E_i E_j = 0$, $i \neq j$; (ii) $I = E_1 + \cdots + E_r$.

10.67. Let $E_1, \ldots, E_r$ be linear operators on $V$ such that
(i) $E_i^2 = E_i$ (i.e., the $E_i$ are projections); (ii) $E_i E_j = 0$, $i \neq j$; (iii) $I = E_1 + \cdots + E_r$.
Prove that $V = \text{Im}\, E_1 \oplus \cdots \oplus \text{Im}\, E_r$.

10.68. Suppose $E: V \to V$ is a projection (i.e., $E^2 = E$). Prove that $E$ has a matrix representation of the form $\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$, where $r$ is the rank of $E$ and $I_r$ is the $r$-square identity matrix.

10.69. Prove that any two projections of the same rank are similar. (Hint: Use the result of Problem 10.68.)

10.70. Suppose $E: V \to V$ is a projection. Prove
(i) $I - E$ is a projection and $V = \text{Im}\, E \oplus \text{Im}\,(I - E)$, (ii) $I + E$ is invertible (if $1 + 1 \neq 0$).
Quotient Spaces

10.71. Let $W$ be a subspace of $V$. Suppose the set of cosets $\{v_1 + W, v_2 + W, \ldots, v_n + W\}$ in $V/W$ is linearly independent. Show that the set of vectors $\{v_1, v_2, \ldots, v_n\}$ in $V$ is also linearly independent.

10.72. Let $W$ be a subspace of $V$. Suppose the set of vectors $\{u_1, u_2, \ldots, u_n\}$ in $V$ is linearly independent, and that $L(u_i) \cap W = \{0\}$. Show that the set of cosets $\{u_1 + W, \ldots, u_n + W\}$ in $V/W$ is also linearly independent.

10.73. Suppose $V = U \oplus W$ and that $\{u_1, \ldots, u_n\}$ is a basis of $U$. Show that $\{u_1 + W, \ldots, u_n + W\}$ is a basis of the quotient space $V/W$. (Observe that no condition is placed on the dimensionality of $V$ or $W$.)

10.74. Let $W$ be the solution space of the linear equation
$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0, \qquad a_i \in K$$
and let $v = (b_1, b_2, \ldots, b_n) \in K^n$. Prove that the coset $v + W$ of $W$ in $K^n$ is the solution set of the linear equation
$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b, \qquad\text{where } b = a_1 b_1 + \cdots + a_n b_n$$

10.75. Let $V$ be the vector space of polynomials over $\mathbf{R}$ and let $W$ be the subspace of polynomials divisible by $t^4$ (i.e., of the form $a_0 t^4 + a_1 t^5 + \cdots + a_{n-4} t^n$). Show that the quotient space $V/W$ has dimension 4.

10.76. Let $U$ and $W$ be subspaces of $V$ such that $W \subset U \subset V$. Note that any coset $u + W$ of $W$ in $U$ may also be viewed as a coset of $W$ in $V$, because $u \in U$ implies $u \in V$; hence, $U/W$ is a subset of $V/W$. Prove that (i) $U/W$ is a subspace of $V/W$, (ii) $\dim(V/W) - \dim(U/W) = \dim(V/U)$.

10.77. Let $U$ and $W$ be subspaces of $V$. Show that the cosets of $U \cap W$ in $V$ can be obtained by intersecting each of the cosets of $U$ in $V$ by each of the cosets of $W$ in $V$:
$$V/(U \cap W) = \{(v + U) \cap (v' + W) : v, v' \in V\}$$

10.78. Let $T: V \to V'$ be linear with kernel $W$ and image $U$. Show that the quotient space $V/W$ is isomorphic to $U$ under the mapping $\theta: V/W \to U$ defined by $\theta(v + W) = T(v)$. Furthermore, show that $T = i \circ \theta \circ \eta$, where $\eta: V \to V/W$ is the natural mapping of $V$ into $V/W$ (i.e., $\eta(v) = v + W$), and $i: U \to V'$ is the inclusion mapping (i.e., $i(u) = u$). (See diagram.)

[Diagram: $V \xrightarrow{\eta} V/W \xrightarrow{\theta} U \xrightarrow{i} V'$, with $T: V \to V'$ as the composite.]


ANSWERS TO SUPPLEMENTARY PROBLEMS 


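10.41. (a) $\mathbf{R}^2$ and $\{0\}$, (b) $\mathbf{C}^2$, $\{0\}$, $W_1 = \text{span}(2, 1 - 2i)$, $W_2 = \text{span}(2, 1 + 2i)$

10.52. (a) $\text{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix} \right)$, $\text{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, 2, 2, \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix} \right)$;
(b) $\text{diag}\left( \begin{bmatrix} 7 & 1 \\ 0 & 7 \end{bmatrix}, \begin{bmatrix} 7 & 1 \\ 0 & 7 \end{bmatrix}, 7 \right)$, $\text{diag}\left( \begin{bmatrix} 7 & 1 \\ 0 & 7 \end{bmatrix}, 7, 7, 7 \right)$;
(c) Let $M_k$ denote a Jordan block with $\lambda = 2$ and order $k$. Then $\text{diag}(M_3, M_3, M_1)$, $\text{diag}(M_3, M_2, M_2)$, $\text{diag}(M_3, M_2, M_1, M_1)$, $\text{diag}(M_3, M_1, M_1, M_1, M_1)$

10.60. Let $A = \begin{bmatrix} 0 & -3 \\ 1 & 2 \end{bmatrix}$, $B = \begin{bmatrix} 0 & -1 \\ 1 & -2 \end{bmatrix}$, $C = \begin{bmatrix} 0 & 0 & 8 \\ 1 & 0 & -12 \\ 0 & 1 & 6 \end{bmatrix}$, $D = \begin{bmatrix} 0 & -4 \\ 1 & 4 \end{bmatrix}$.
(a) $\text{diag}(A, A, B)$, $\text{diag}(A, B, B)$, $\text{diag}(A, B, -1, -1)$; (b) $\text{diag}(C, C)$, $\text{diag}(C, D, 2)$, $\text{diag}(C, 2, 2, 2)$

10.61. Let $A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$, $B = \begin{bmatrix} 0 & 3 \\ 1 & 0 \end{bmatrix}$.
(a) $\text{diag}(A, B)$, (b) $\text{diag}(A, \sqrt{3}, -\sqrt{3})$, (c) $\text{diag}(i, -i, \sqrt{3}, -\sqrt{3})$

10.62. Companion matrix with the last column $[-\lambda^4, 4\lambda^3, -6\lambda^2, 4\lambda]^T$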


Linear Functionals 
and the Dual Space 


11.1 Introduction 


In this chapter, we study linear mappings from a vector space V into its field K of scalars. (Unless 
otherwise stated or implied, we view K as a vector space over itself.) Naturally all the theorems and 
results for arbitrary mappings on V hold for this special case. However, we treat these mappings 
separately because of their fundamental importance and because the special relationship of V to K gives 
rise to new notions and results that do not apply in the general case. 


11.2 Linear Functionals and the Dual Space 


Let $V$ be a vector space over a field $K$. A mapping $\phi: V \to K$ is termed a linear functional (or linear form) if, for every $u, v \in V$ and every $a, b \in K$,
$$\phi(au + bv) = a\phi(u) + b\phi(v)$$
In other words, a linear functional on $V$ is a linear mapping from $V$ into $K$.
EXAMPLE 11.1 


(a) Let $\pi_i: K^n \to K$ be the $i$th projection mapping; that is, $\pi_i(a_1, a_2, \ldots, a_n) = a_i$. Then $\pi_i$ is linear and so it is a linear functional on $K^n$.

(b) Let $V$ be the vector space of polynomials in $t$ over $\mathbf{R}$. Let $J: V \to \mathbf{R}$ be the integral operator defined by $J(p(t)) = \int_0^1 p(t)\,dt$. Recall that $J$ is linear; hence, it is a linear functional on $V$.

(c) Let $V$ be the vector space of $m$-square matrices over $K$. Let $T: V \to K$ be the trace mapping
$$T(A) = a_{11} + a_{22} + \cdots + a_{mm}, \qquad\text{where } A = [a_{ij}]$$


That is, T assigns to a matrix A the sum of its diagonal elements. This map is linear (Problem 11.24), and so it is 
a linear functional on V. 
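
The three functionals of Example 11.1 can be written down directly. The sketch below is illustrative only: vectors and matrices are plain Python lists, polynomials are coefficient lists `[c_0, c_1, ...]` with integer entries, and the helper names are hypothetical. The final checks test the identity $\phi(au + bv) = a\phi(u) + b\phi(v)$ on sample data, which is a spot check rather than a proof of linearity.

```python
from fractions import Fraction

def projection(i):
    """pi_i(a_1, ..., a_n) = a_i, with i counted from 1 as in the text."""
    return lambda v: v[i - 1]

def trace(A):
    """Sum of the diagonal entries of a square matrix given as a list of rows."""
    return sum(A[i][i] for i in range(len(A)))

def integral_0_1(p):
    """J(p) = integral from 0 to 1 of c_0 + c_1 t + c_2 t^2 + ..."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(p))

def combine_vec(a, u, b, v):
    """The linear combination a*u + b*v of two equal-length lists."""
    return [a * x + b * y for x, y in zip(u, v)]

def combine_mat(a, A, b, B):
    """The linear combination a*A + b*B of two matrices of the same shape."""
    return [combine_vec(a, r, b, s) for r, s in zip(A, B)]

def respects_combination(phi, u, v, a, b, combine):
    """Check phi(a*u + b*v) == a*phi(u) + b*phi(v) for the given sample."""
    return phi(combine(a, u, b, v)) == a * phi(u) + b * phi(v)

ok_projection = respects_combination(projection(2), [1, 2, 3], [4, 5, 6],
                                     2, -3, combine_vec)
ok_trace = respects_combination(trace, [[1, 2], [3, 4]], [[0, 1], [1, 5]],
                                2, -3, combine_mat)
ok_integral = respects_combination(integral_0_1, [1, 2, 0], [0, 0, 3],
                                   2, -3, combine_vec)
```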


By Theorem 5.10, the set of linear functionals on a vector space $V$ over a field $K$ is also a vector space over $K$, with addition and scalar multiplication defined by
$$(\phi + \sigma)(v) = \phi(v) + \sigma(v) \quad\text{and}\quad (k\phi)(v) = k\phi(v)$$
where $\phi$ and $\sigma$ are linear functionals on $V$ and $k \in K$. This space is called the dual space of $V$ and is denoted by $V^*$.


EXAMPLE 11.2 Let $V = K^n$, the vector space of $n$-tuples, which we write as column vectors. Then the dual space $V^*$ can be identified with the space of row vectors. In particular, any linear functional $\phi = (a_1, \ldots, a_n)$ in $V^*$ has the representation
$$\phi(x_1, x_2, \ldots, x_n) = [a_1, a_2, \ldots, a_n][x_1, x_2, \ldots, x_n]^T = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$$
Historically, the formal expression on the right was termed a linear form.


11.3 Dual Basis


Suppose V is a vector space of dimension n over K. By Theorem 5.11, the dimension of the dual space V* 
is also n (because K is of dimension 1 over itself). In fact, each basis of V determines a basis of V* as 
follows (see Problem 11.3 for the proof). 


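THEOREM 11.1: Suppose $\{v_1, \ldots, v_n\}$ is a basis of $V$ over $K$. Let $\phi_1, \ldots, \phi_n \in V^*$ be the linear functionals defined by
$$\phi_i(v_j) = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$$
Then $\{\phi_1, \ldots, \phi_n\}$ is a basis of $V^*$.

The above basis $\{\phi_i\}$ is termed the basis dual to $\{v_i\}$ or the dual basis. The above formula, which uses the Kronecker delta $\delta_{ij}$, is a short way of writing
$$\phi_1(v_1) = 1,\ \phi_1(v_2) = 0,\ \phi_1(v_3) = 0,\ \ldots,\ \phi_1(v_n) = 0$$
$$\phi_2(v_1) = 0,\ \phi_2(v_2) = 1,\ \phi_2(v_3) = 0,\ \ldots,\ \phi_2(v_n) = 0$$
$$\cdots$$
$$\phi_n(v_1) = 0,\ \phi_n(v_2) = 0,\ \ldots,\ \phi_n(v_{n-1}) = 0,\ \phi_n(v_n) = 1$$
By Theorem 5.2, these linear mappings $\phi_i$ are unique and well defined.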


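EXAMPLE 11.3 Consider the basis $\{v_1 = (2, 1),\ v_2 = (3, 1)\}$ of $\mathbf{R}^2$. Find the dual basis $\{\phi_1, \phi_2\}$.
We seek linear functionals $\phi_1(x, y) = ax + by$ and $\phi_2(x, y) = cx + dy$ such that
$$\phi_1(v_1) = 1,\quad \phi_1(v_2) = 0,\quad \phi_2(v_1) = 0,\quad \phi_2(v_2) = 1$$
These four conditions lead to the following two systems of linear equations:
$$\phi_1(v_1) = \phi_1(2, 1) = 2a + b = 1, \qquad \phi_1(v_2) = \phi_1(3, 1) = 3a + b = 0$$
and
$$\phi_2(v_1) = \phi_2(2, 1) = 2c + d = 0, \qquad \phi_2(v_2) = \phi_2(3, 1) = 3c + d = 1$$
The solutions yield $a = -1$, $b = 3$ and $c = 1$, $d = -2$. Hence, $\phi_1(x, y) = -x + 3y$ and $\phi_2(x, y) = x - 2y$ form the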
dual basis. 
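
The computation in Example 11.3 generalizes: if the basis vectors of $K^n$ are placed as the columns of a matrix $B$, then the conditions $\phi_i(v_j) = \delta_{ij}$ say that the coefficient rows of the $\phi_i$ form $B^{-1}$. The following is a minimal sketch under that observation, using exact rational arithmetic; the names `inverse` and `dual_basis` are hypothetical, and the input is assumed to be a genuine basis (the code does not handle singular matrices gracefully).

```python
from fractions import Fraction

def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination over the rationals.

    Assumes M is invertible; a singular input raises StopIteration.
    """
    n = len(M)
    # Augment M with the identity matrix.
    A = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]

def dual_basis(basis):
    """Return the coefficient rows of the functionals dual to `basis`."""
    n = len(basis)
    B = [[basis[j][i] for j in range(n)] for i in range(n)]  # columns = basis vectors
    return inverse(B)

phi = dual_basis([(2, 1), (3, 1)])
for row in phi:
    print([int(c) for c in row])   # integer coefficients in this example
```

Each returned row `[a, b]` is read as the functional $(x, y) \mapsto ax + by$.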


The next two theorems (proved in Problems 11.4 and 11.5, respectively) give relationships between 
bases and their duals. 


THEOREM 11.2: Let $\{v_1, \ldots, v_n\}$ be a basis of $V$ and let $\{\phi_1, \ldots, \phi_n\}$ be the dual basis in $V^*$. Then

(i) For any vector $u \in V$, $u = \phi_1(u)v_1 + \phi_2(u)v_2 + \cdots + \phi_n(u)v_n$.
(ii) For any linear functional $\sigma \in V^*$, $\sigma = \sigma(v_1)\phi_1 + \sigma(v_2)\phi_2 + \cdots + \sigma(v_n)\phi_n$.

THEOREM 11.3: Let $\{v_1, \ldots, v_n\}$ and $\{w_1, \ldots, w_n\}$ be bases of $V$ and let $\{\phi_1, \ldots, \phi_n\}$ and $\{\sigma_1, \ldots, \sigma_n\}$ be the bases of $V^*$ dual to $\{v_i\}$ and $\{w_i\}$, respectively. Suppose $P$ is the change-of-basis matrix from $\{v_i\}$ to $\{w_i\}$. Then $(P^{-1})^T$ is the change-of-basis matrix from $\{\phi_i\}$ to $\{\sigma_i\}$.


11.4 Second Dual Space 


We repeat: Every vector space $V$ has a dual space $V^*$, which consists of all the linear functionals on $V$. Thus, $V^*$ has a dual space $V^{**}$, called the second dual of $V$, which consists of all the linear functionals on $V^*$.

We now show that each $v \in V$ determines a specific element $\hat{v} \in V^{**}$. First, for any $\phi \in V^*$, we define
$$\hat{v}(\phi) = \phi(v)$$
It remains to be shown that this map $\hat{v}: V^* \to K$ is linear. For any scalars $a, b \in K$ and any linear functionals $\phi, \sigma \in V^*$, we have
$$\hat{v}(a\phi + b\sigma) = (a\phi + b\sigma)(v) = a\phi(v) + b\sigma(v) = a\hat{v}(\phi) + b\hat{v}(\sigma)$$
That is, $\hat{v}$ is linear and so $\hat{v} \in V^{**}$. The following theorem (proved in Problem 11.7) holds.

THEOREM 11.4: If $V$ has finite dimension, then the mapping $v \mapsto \hat{v}$ is an isomorphism of $V$ onto $V^{**}$.

The above mapping $v \mapsto \hat{v}$ is called the natural mapping of $V$ into $V^{**}$. We emphasize that this mapping is never onto $V^{**}$ if $V$ is not finite-dimensional. However, it is always linear, and moreover, it is always one-to-one.

Now suppose $V$ does have finite dimension. By Theorem 11.4, the natural mapping determines an isomorphism between $V$ and $V^{**}$. Unless otherwise stated, we will identify $V$ with $V^{**}$ by this mapping. Accordingly, we will view $V$ as the space of linear functionals on $V^*$ and write $V = V^{**}$. We remark that if $\{\phi_i\}$ is the basis of $V^*$ dual to a basis $\{v_i\}$ of $V$, then $\{v_i\}$ is the basis of $V^{**} = V$ that is dual to $\{\phi_i\}$.


11.5 Annihilators 


Let $W$ be a subset (not necessarily a subspace) of a vector space $V$. A linear functional $\phi \in V^*$ is called an annihilator of $W$ if $\phi(w) = 0$ for every $w \in W$, that is, if $\phi(W) = \{0\}$. We show that the set of all such mappings, denoted by $W^0$ and called the annihilator of $W$, is a subspace of $V^*$. Clearly, $0 \in W^0$. Now suppose $\phi, \sigma \in W^0$. Then, for any scalars $a, b \in K$ and for any $w \in W$,
$$(a\phi + b\sigma)(w) = a\phi(w) + b\sigma(w) = a0 + b0 = 0$$
Thus, $a\phi + b\sigma \in W^0$, and so $W^0$ is a subspace of $V^*$.
In the case that $W$ is a subspace of $V$, we have the following relationship between $W$ and its annihilator $W^0$ (see Problem 11.11 for the proof).

THEOREM 11.5: Suppose $V$ has finite dimension and $W$ is a subspace of $V$. Then
(i) $\dim W + \dim W^0 = \dim V$ and (ii) $W^{00} = W$

Here $W^{00} = \{v \in V : \phi(v) = 0 \text{ for every } \phi \in W^0\}$ or, equivalently, $W^{00} = (W^0)^0$, where $W^{00}$ is viewed
as a subspace of V under the identification of V and V**. 
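
For a subspace $W$ of $K^n$ spanned by given vectors, a functional $\phi = (a_1, \ldots, a_n)$ lies in $W^0$ exactly when it vanishes on each spanning vector, so a basis of $W^0$ is a basis of the solution space of a homogeneous system. The sketch below is a minimal illustration of this and of part (i) of Theorem 11.5; the function names and the sample vectors in $\mathbf{R}^4$ are assumptions made for the example.

```python
from fractions import Fraction

def rref(M):
    """Reduced row echelon form over the rationals; returns (rows, pivot_columns)."""
    A = [[Fraction(x) for x in row] for row in M]
    pivots = []
    r = 0
    for c in range(len(A[0])):
        pr = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pr is None:
            continue
        A[r], A[pr] = A[pr], A[r]
        p = A[r][c]
        A[r] = [x / p for x in A[r]]
        for i in range(len(A)):
            if i != r and A[i][c] != 0:
                f = A[i][c]
                A[i] = [x - f * y for x, y in zip(A[i], A[r])]
        pivots.append(c)
        r += 1
        if r == len(A):
            break
    return A, pivots

def annihilator_basis(spanning_vectors):
    """Coefficient rows of a basis of W^0, where W is spanned by the given vectors."""
    A, pivots = rref(spanning_vectors)
    n = len(spanning_vectors[0])
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        phi = [Fraction(0)] * n
        phi[f] = Fraction(1)              # set one free variable to 1
        for row, p in zip(A, pivots):     # solve for the pivot variables
            phi[p] = -row[f]
        basis.append(phi)
    return basis

W = [(1, 2, -3, 4), (0, 1, 4, -1)]
W0 = annihilator_basis(W)
dim_W = len(rref(W)[1])
print(dim_W, len(W0))   # the two dimensions add up to 4
```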


11.6 Transpose of a Linear Mapping 


Let $T: V \to U$ be an arbitrary linear mapping from a vector space $V$ into a vector space $U$. Now for any linear functional $\phi \in U^*$, the composition $\phi \circ T$ is a linear mapping from $V$ into $K$:
$$V \xrightarrow{\;T\;} U \xrightarrow{\;\phi\;} K$$
That is, $\phi \circ T \in V^*$. Thus, the correspondence
$$\phi \mapsto \phi \circ T$$
is a mapping from $U^*$ into $V^*$; we denote it by $T^t$ and call it the transpose of $T$. In other words, $T^t: U^* \to V^*$ is defined by
$$T^t(\phi) = \phi \circ T$$
Thus, $(T^t(\phi))(v) = \phi(T(v))$ for every $v \in V$.




THEOREM 11.6: The transpose mapping $T^t$ defined above is linear.

Proof. For any scalars $a, b \in K$ and any linear functionals $\phi, \sigma \in U^*$,
$$T^t(a\phi + b\sigma) = (a\phi + b\sigma) \circ T = a(\phi \circ T) + b(\sigma \circ T) = aT^t(\phi) + bT^t(\sigma)$$
That is, $T^t$ is linear, as claimed.

We emphasize that if $T$ is a linear mapping from $V$ into $U$, then $T^t$ is a linear mapping from $U^*$ into $V^*$. The name "transpose" for the mapping $T^t$ no doubt derives from the following theorem (proved in Problem 11.16).

THEOREM 11.7: Let $T: V \to U$ be linear, and let $A$ be the matrix representation of $T$ relative to bases $\{v_i\}$ of $V$ and $\{u_i\}$ of $U$. Then the transpose matrix $A^T$ is the matrix representation of $T^t: U^* \to V^*$ relative to the bases dual to $\{u_i\}$ and $\{v_i\}$.
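
Theorem 11.7 can be seen concretely when $V = U = \mathbf{R}^2$ with the usual basis: the coefficients of $T^t(\phi) = \phi \circ T$ are obtained by applying $A^T$ to the coefficients of $\phi$. The sketch below uses an operator and a functional chosen only for illustration, and the helper names are hypothetical.

```python
def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

def mat_vec(A, v):
    """Apply the matrix A to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def apply_functional(coeffs, v):
    """Evaluate the functional with the given coefficients at the vector v."""
    return sum(c * x for c, x in zip(coeffs, v))

# T(x, y) = (2x - 3y, 5x + 2y) relative to the usual basis of R^2
A = [[2, -3],
     [5, 2]]

# phi(x, y) = x - 2y, written by its coefficients in the dual basis
phi = [1, -2]

# Coefficients of T^t(phi) = phi o T, computed with the transpose matrix
pullback = mat_vec(transpose(A), phi)

# Compare both sides of (T^t(phi))(v) = phi(T(v)) at a sample vector
v = [3, 4]
lhs = apply_functional(pullback, v)
rhs = apply_functional(phi, mat_vec(A, v))
print(pullback, lhs, rhs)
```

Here `pullback` represents the functional $(x, y) \mapsto -8x - 7y$, which agrees with expanding $(2x - 3y) - 2(5x + 2y)$ by hand.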


SOLVED PROBLEMS 


Dual Spaces and Dual Bases 


11.1. Find the basis $\{\phi_1, \phi_2, \phi_3\}$ that is dual to the following basis of $\mathbf{R}^3$:
$$\{v_1 = (1, -1, 3),\ v_2 = (0, 1, -1),\ v_3 = (0, 3, -2)\}$$
The linear functionals may be expressed in the form
$$\phi_1(x, y, z) = a_1 x + a_2 y + a_3 z, \quad \phi_2(x, y, z) = b_1 x + b_2 y + b_3 z, \quad \phi_3(x, y, z) = c_1 x + c_2 y + c_3 z$$
By definition of the dual basis, $\phi_i(v_j) = 0$ for $i \neq j$, but $\phi_i(v_j) = 1$ for $i = j$.

We find $\phi_1$ by setting $\phi_1(v_1) = 1$, $\phi_1(v_2) = 0$, $\phi_1(v_3) = 0$. This yields
$$\phi_1(1, -1, 3) = a_1 - a_2 + 3a_3 = 1, \quad \phi_1(0, 1, -1) = a_2 - a_3 = 0, \quad \phi_1(0, 3, -2) = 3a_2 - 2a_3 = 0$$
Solving the system of equations yields $a_1 = 1$, $a_2 = 0$, $a_3 = 0$. Thus, $\phi_1(x, y, z) = x$.

We find $\phi_2$ by setting $\phi_2(v_1) = 0$, $\phi_2(v_2) = 1$, $\phi_2(v_3) = 0$. This yields
$$\phi_2(1, -1, 3) = b_1 - b_2 + 3b_3 = 0, \quad \phi_2(0, 1, -1) = b_2 - b_3 = 1, \quad \phi_2(0, 3, -2) = 3b_2 - 2b_3 = 0$$
Solving the system of equations yields $b_1 = 7$, $b_2 = -2$, $b_3 = -3$. Thus, $\phi_2(x, y, z) = 7x - 2y - 3z$.

We find $\phi_3$ by setting $\phi_3(v_1) = 0$, $\phi_3(v_2) = 0$, $\phi_3(v_3) = 1$. This yields
$$\phi_3(1, -1, 3) = c_1 - c_2 + 3c_3 = 0, \quad \phi_3(0, 1, -1) = c_2 - c_3 = 0, \quad \phi_3(0, 3, -2) = 3c_2 - 2c_3 = 1$$
Solving the system of equations yields $c_1 = -2$, $c_2 = 1$, $c_3 = 1$. Thus, $\phi_3(x, y, z) = -2x + y + z$.


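11.2. Let $V = \{a + bt : a, b \in \mathbf{R}\}$, the vector space of real polynomials of degree $\leq 1$. Find the basis $\{v_1, v_2\}$ of $V$ that is dual to the basis $\{\phi_1, \phi_2\}$ of $V^*$ defined by
$$\phi_1(f(t)) = \int_0^1 f(t)\,dt \quad\text{and}\quad \phi_2(f(t)) = \int_0^2 f(t)\,dt$$
Let $v_1 = a + bt$ and $v_2 = c + dt$. By definition of the dual basis,
$$\phi_1(v_1) = 1,\ \phi_2(v_1) = 0 \quad\text{and}\quad \phi_1(v_2) = 0,\ \phi_2(v_2) = 1$$
Thus,
$$\phi_1(v_1) = \int_0^1 (a + bt)\,dt = a + \tfrac{1}{2}b = 1, \qquad \phi_1(v_2) = \int_0^1 (c + dt)\,dt = c + \tfrac{1}{2}d = 0$$
$$\phi_2(v_1) = \int_0^2 (a + bt)\,dt = 2a + 2b = 0, \qquad \phi_2(v_2) = \int_0^2 (c + dt)\,dt = 2c + 2d = 1$$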
Solving each system yields $a = 2$, $b = -2$ and $c = -\tfrac{1}{2}$, $d = 1$. Thus, $\{v_1 = 2 - 2t,\ v_2 = -\tfrac{1}{2} + t\}$ is the basis of $V$ that is dual to $\{\phi_1, \phi_2\}$.

11.3. Prove Theorem 11.1: Suppose $\{v_1, \ldots, v_n\}$ is a basis of $V$ over $K$. Let $\phi_1, \ldots, \phi_n \in V^*$ be defined by $\phi_i(v_j) = 0$ for $i \neq j$, but $\phi_i(v_j) = 1$ for $i = j$. Then $\{\phi_1, \ldots, \phi_n\}$ is a basis of $V^*$.

We first show that $\{\phi_1, \ldots, \phi_n\}$ spans $V^*$. Let $\phi$ be an arbitrary element of $V^*$, and suppose
$$\phi(v_1) = k_1,\ \phi(v_2) = k_2,\ \ldots,\ \phi(v_n) = k_n$$
Set $\sigma = k_1\phi_1 + \cdots + k_n\phi_n$. Then
$$\sigma(v_1) = (k_1\phi_1 + \cdots + k_n\phi_n)(v_1) = k_1\phi_1(v_1) + k_2\phi_2(v_1) + \cdots + k_n\phi_n(v_1) = k_1 \cdot 1 + k_2 \cdot 0 + \cdots + k_n \cdot 0 = k_1$$
Similarly, for $i = 2, \ldots, n$,
$$\sigma(v_i) = (k_1\phi_1 + \cdots + k_n\phi_n)(v_i) = k_1\phi_1(v_i) + \cdots + k_i\phi_i(v_i) + \cdots + k_n\phi_n(v_i) = k_i$$
Thus, $\phi(v_i) = \sigma(v_i)$ for $i = 1, \ldots, n$. Because $\phi$ and $\sigma$ agree on the basis vectors, $\phi = \sigma = k_1\phi_1 + \cdots + k_n\phi_n$. Accordingly, $\{\phi_1, \ldots, \phi_n\}$ spans $V^*$.

It remains to be shown that $\{\phi_1, \ldots, \phi_n\}$ is linearly independent. Suppose
$$a_1\phi_1 + a_2\phi_2 + \cdots + a_n\phi_n = 0$$
Applying both sides to $v_1$, we obtain
$$0 = 0(v_1) = (a_1\phi_1 + \cdots + a_n\phi_n)(v_1) = a_1\phi_1(v_1) + a_2\phi_2(v_1) + \cdots + a_n\phi_n(v_1) = a_1 \cdot 1 + a_2 \cdot 0 + \cdots + a_n \cdot 0 = a_1$$
Similarly, for $i = 2, \ldots, n$,
$$0 = 0(v_i) = (a_1\phi_1 + \cdots + a_n\phi_n)(v_i) = a_1\phi_1(v_i) + \cdots + a_i\phi_i(v_i) + \cdots + a_n\phi_n(v_i) = a_i$$
That is, $a_1 = 0, \ldots, a_n = 0$. Hence, $\{\phi_1, \ldots, \phi_n\}$ is linearly independent, and so it is a basis of $V^*$.

11.4. Prove Theorem 11.2: Let $\{v_1, \ldots, v_n\}$ be a basis of $V$ and let $\{\phi_1, \ldots, \phi_n\}$ be the dual basis in $V^*$. For any $u \in V$ and any $\sigma \in V^*$, (i) $u = \sum_i \phi_i(u)v_i$, (ii) $\sigma = \sum_i \sigma(v_i)\phi_i$.

Suppose
$$u = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n \qquad (1)$$
Then
$$\phi_1(u) = a_1\phi_1(v_1) + a_2\phi_1(v_2) + \cdots + a_n\phi_1(v_n) = a_1 \cdot 1 + a_2 \cdot 0 + \cdots + a_n \cdot 0 = a_1$$
Similarly, for $i = 2, \ldots, n$,
$$\phi_i(u) = a_1\phi_i(v_1) + \cdots + a_i\phi_i(v_i) + \cdots + a_n\phi_i(v_n) = a_i$$
That is, $\phi_1(u) = a_1$, $\phi_2(u) = a_2$, ..., $\phi_n(u) = a_n$. Substituting these results into (1), we obtain (i).

Next we prove (ii). Applying the linear functional $\sigma$ to both sides of (i),
$$\sigma(u) = \phi_1(u)\sigma(v_1) + \phi_2(u)\sigma(v_2) + \cdots + \phi_n(u)\sigma(v_n)$$
$$= \sigma(v_1)\phi_1(u) + \sigma(v_2)\phi_2(u) + \cdots + \sigma(v_n)\phi_n(u)$$
$$= (\sigma(v_1)\phi_1 + \sigma(v_2)\phi_2 + \cdots + \sigma(v_n)\phi_n)(u)$$
Because the above holds for every $u \in V$, $\sigma = \sigma(v_1)\phi_1 + \sigma(v_2)\phi_2 + \cdots + \sigma(v_n)\phi_n$, as claimed.

11.5. Prove Theorem 11.3: Let $\{v_i\}$ and $\{w_i\}$ be bases of $V$ and let $\{\phi_i\}$ and $\{\sigma_i\}$ be the respective dual bases in $V^*$. Let $P$ be the change-of-basis matrix from $\{v_i\}$ to $\{w_i\}$. Then $(P^{-1})^T$ is the change-of-basis matrix from $\{\phi_i\}$ to $\{\sigma_i\}$.

Suppose, for $i = 1, \ldots, n$,
$$w_i = a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n \quad\text{and}\quad \sigma_i = b_{i1}\phi_1 + b_{i2}\phi_2 + \cdots + b_{in}\phi_n$$
Then $P = [a_{ij}]$ and $Q = [b_{ij}]$. We seek to prove that $Q = (P^{-1})^T$.

Let $R_i$ denote the $i$th row of $Q$ and let $C_j$ denote the $j$th column of $P^T$. Then
$$R_i = (b_{i1}, b_{i2}, \ldots, b_{in}) \quad\text{and}\quad C_j = (a_{j1}, a_{j2}, \ldots, a_{jn})^T$$
By definition of the dual basis,
$$\sigma_i(w_j) = (b_{i1}\phi_1 + b_{i2}\phi_2 + \cdots + b_{in}\phi_n)(a_{j1}v_1 + a_{j2}v_2 + \cdots + a_{jn}v_n) = b_{i1}a_{j1} + b_{i2}a_{j2} + \cdots + b_{in}a_{jn} = R_i C_j = \delta_{ij}$$
where $\delta_{ij}$ is the Kronecker delta. Thus,
$$QP^T = [R_i C_j] = [\delta_{ij}] = I$$
Therefore, $Q = (P^T)^{-1} = (P^{-1})^T$, as claimed.

11.6. Suppose $v \in V$, $v \neq 0$, and $\dim V = n$. Show that there exists $\phi \in V^*$ such that $\phi(v) \neq 0$.

We extend $\{v\}$ to a basis $\{v, v_2, \ldots, v_n\}$ of $V$. By Theorem 5.2, there exists a unique linear mapping $\phi: V \to K$ such that $\phi(v) = 1$ and $\phi(v_i) = 0$, $i = 2, \ldots, n$. Hence, $\phi$ has the desired property.

11.7. Prove Theorem 11.4: Suppose $\dim V = n$. Then the natural mapping $v \mapsto \hat{v}$ is an isomorphism of $V$ onto $V^{**}$.

We first prove that the map $v \mapsto \hat{v}$ is linear, that is, for any vectors $v, w \in V$ and any scalars $a, b \in K$, $\widehat{av + bw} = a\hat{v} + b\hat{w}$. For any linear functional $\phi \in V^*$,
$$\widehat{av + bw}(\phi) = \phi(av + bw) = a\phi(v) + b\phi(w) = a\hat{v}(\phi) + b\hat{w}(\phi) = (a\hat{v} + b\hat{w})(\phi)$$
Because $\widehat{av + bw}(\phi) = (a\hat{v} + b\hat{w})(\phi)$ for every $\phi \in V^*$, we have $\widehat{av + bw} = a\hat{v} + b\hat{w}$. Thus, the map $v \mapsto \hat{v}$ is linear.

Now suppose $v \in V$, $v \neq 0$. Then, by Problem 11.6, there exists $\phi \in V^*$ for which $\phi(v) \neq 0$. Hence, $\hat{v}(\phi) = \phi(v) \neq 0$, and thus $\hat{v} \neq 0$. Because $v \neq 0$ implies $\hat{v} \neq 0$, the map $v \mapsto \hat{v}$ is nonsingular and hence an isomorphism (Theorem 5.64).

Now $\dim V = \dim V^* = \dim V^{**}$, because $V$ has finite dimension. Accordingly, the mapping $v \mapsto \hat{v}$ is an isomorphism of $V$ onto $V^{**}$.


Annihilators 


11.8. Show that if $\phi \in V^*$ annihilates a subset $S$ of $V$, then $\phi$ annihilates the linear span $L(S)$ of $S$. Hence, $S^0 = [\text{span}(S)]^0$.

Suppose $v \in \text{span}(S)$. Then there exist $w_1, \ldots, w_r \in S$ for which $v = a_1 w_1 + a_2 w_2 + \cdots + a_r w_r$. Hence,
$$\phi(v) = a_1\phi(w_1) + a_2\phi(w_2) + \cdots + a_r\phi(w_r) = a_1 0 + a_2 0 + \cdots + a_r 0 = 0$$
Because $v$ was an arbitrary element of $\text{span}(S)$, $\phi$ annihilates $\text{span}(S)$, as claimed.

11.9. Find a basis of the annihilator $W^0$ of the subspace $W$ of $\mathbf{R}^4$ spanned by
$$v_1 = (1, 2, -3, 4) \quad\text{and}\quad v_2 = (0, 1, 4, -1)$$
By Problem 11.8, it suffices to find a basis of the set of linear functionals $\phi$ such that $\phi(v_1) = 0$ and $\phi(v_2) = 0$, where $\phi(x_1, x_2, x_3, x_4) = ax_1 + bx_2 + cx_3 + dx_4$. Thus,
$$\phi(1, 2, -3, 4) = a + 2b - 3c + 4d = 0 \quad\text{and}\quad \phi(0, 1, 4, -1) = b + 4c - d = 0$$
The system of two equations in the unknowns $a, b, c, d$ is in echelon form with free variables $c$ and $d$.

(1) Set $c = 1$, $d = 0$ to obtain the solution $a = 11$, $b = -4$, $c = 1$, $d = 0$.
(2) Set $c = 0$, $d = 1$ to obtain the solution $a = -6$, $b = 1$, $c = 0$, $d = 1$.

The linear functionals $\phi_1(x_i) = 11x_1 - 4x_2 + x_3$ and $\phi_2(x_i) = -6x_1 + x_2 + x_4$ form a basis of $W^0$.

11.10. Show that (a) For any subset $S$ of $V$, $S \subseteq S^{00}$. (b) If $S_1 \subseteq S_2$, then $S_2^0 \subseteq S_1^0$.

(a) Let $v \in S$. Then for every linear functional $\phi \in S^0$, $\hat{v}(\phi) = \phi(v) = 0$. Hence, $\hat{v} \in (S^0)^0$. Therefore, under the identification of $V$ and $V^{**}$, $v \in S^{00}$. Accordingly, $S \subseteq S^{00}$.
(b) Let $\phi \in S_2^0$. Then $\phi(v) = 0$ for every $v \in S_2$. But $S_1 \subseteq S_2$; hence, $\phi$ annihilates every element of $S_1$ (i.e., $\phi \in S_1^0$). Therefore, $S_2^0 \subseteq S_1^0$.

11.11. Prove Theorem 11.5: Suppose $V$ has finite dimension and $W$ is a subspace of $V$. Then
(i) $\dim W + \dim W^0 = \dim V$, (ii) $W^{00} = W$.

(i) Suppose $\dim V = n$ and $\dim W = r \leq n$. We want to show that $\dim W^0 = n - r$. We choose a basis $\{w_1, \ldots, w_r\}$ of $W$ and extend it to a basis of $V$, say $\{w_1, \ldots, w_r, v_1, \ldots, v_{n-r}\}$. Consider the dual basis
$$\{\phi_1, \ldots, \phi_r, \sigma_1, \ldots, \sigma_{n-r}\}$$
By definition of the dual basis, each of the above $\sigma$'s annihilates each $w_i$; hence, $\sigma_1, \ldots, \sigma_{n-r} \in W^0$. We claim that $\{\sigma_i\}$ is a basis of $W^0$. Now $\{\sigma_i\}$ is part of a basis of $V^*$, and so it is linearly independent.

We next show that $\{\sigma_i\}$ spans $W^0$. Let $\sigma \in W^0$. By Theorem 11.2,
$$\sigma = \sigma(w_1)\phi_1 + \cdots + \sigma(w_r)\phi_r + \sigma(v_1)\sigma_1 + \cdots + \sigma(v_{n-r})\sigma_{n-r}$$
$$= 0\phi_1 + \cdots + 0\phi_r + \sigma(v_1)\sigma_1 + \cdots + \sigma(v_{n-r})\sigma_{n-r}$$
$$= \sigma(v_1)\sigma_1 + \cdots + \sigma(v_{n-r})\sigma_{n-r}$$
Consequently, $\{\sigma_1, \ldots, \sigma_{n-r}\}$ spans $W^0$ and so it is a basis of $W^0$. Accordingly, as required,
$$\dim W^0 = n - r = \dim V - \dim W$$

(ii) Suppose $\dim V = n$ and $\dim W = r$. Then $\dim V^* = n$ and, by (i), $\dim W^0 = n - r$. Thus, by (i) applied to $W^0 \subseteq V^*$, $\dim W^{00} = n - (n - r) = r$; therefore, $\dim W = \dim W^{00}$. By Problem 11.10, $W \subseteq W^{00}$. Accordingly, $W = W^{00}$.

11.12. Let $U$ and $W$ be subspaces of $V$. Prove that $(U + W)^0 = U^0 \cap W^0$.

Let $\phi \in (U + W)^0$. Then $\phi$ annihilates $U + W$, and so, in particular, $\phi$ annihilates $U$ and $W$. That is, $\phi \in U^0$ and $\phi \in W^0$; hence, $\phi \in U^0 \cap W^0$. Thus, $(U + W)^0 \subseteq U^0 \cap W^0$.

On the other hand, suppose $\sigma \in U^0 \cap W^0$. Then $\sigma$ annihilates $U$ and also $W$. If $v \in U + W$, then $v = u + w$, where $u \in U$ and $w \in W$. Hence, $\sigma(v) = \sigma(u) + \sigma(w) = 0 + 0 = 0$. Thus, $\sigma$ annihilates $U + W$; that is, $\sigma \in (U + W)^0$. Accordingly, $U^0 \cap W^0 \subseteq (U + W)^0$.

The two inclusion relations together give us the desired equality.

Remark: Observe that no dimension argument is employed in the proof; hence, the result holds for spaces of finite or infinite dimension.


Transpose of a Linear Mapping 


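11.13. Let $\phi$ be the linear functional on $\mathbf{R}^2$ defined by $\phi(x, y) = x - 2y$. For each of the following linear
operators $T$ on $\mathbf{R}^2$, find $(T^t(\phi))(x, y)$:
(a) $T(x, y) = (x, 0)$, (b) $T(x, y) = (y, x + y)$, (c) $T(x, y) = (2x - 3y, 5x + 2y)$

By definition, $T^t(\phi) = \phi \circ T$; that is, $(T^t(\phi))(v) = \phi(T(v))$ for every $v$. Hence,
(a) $(T^t(\phi))(x, y) = \phi(T(x, y)) = \phi(x, 0) = x$
(b) $(T^t(\phi))(x, y) = \phi(T(x, y)) = \phi(y, x + y) = y - 2(x + y) = -2x - y$
(c) $(T^t(\phi))(x, y) = \phi(T(x, y)) = \phi(2x - 3y, 5x + 2y) = (2x - 3y) - 2(5x + 2y) = -8x - 7y$


11.14. Let $T: V \to U$ be linear and let $T^t: U^* \to V^*$ be its transpose. Show that the kernel of $T^t$ is the
annihilator of the image of $T$; that is, $\operatorname{Ker} T^t = (\operatorname{Im} T)^0$.

Suppose $\phi \in \operatorname{Ker} T^t$; that is, $T^t(\phi) = \phi \circ T = 0$. If $u \in \operatorname{Im} T$, then $u = T(v)$ for some $v \in V$; hence,

\[ \phi(u) = \phi(T(v)) = (\phi \circ T)(v) = 0(v) = 0 \]

We have that $\phi(u) = 0$ for every $u \in \operatorname{Im} T$; hence, $\phi \in (\operatorname{Im} T)^0$. Thus, $\operatorname{Ker} T^t \subseteq (\operatorname{Im} T)^0$.
On the other hand, suppose $\sigma \in (\operatorname{Im} T)^0$; that is, $\sigma(\operatorname{Im} T) = \{0\}$. Then, for every $v \in V$,

\[ (T^t(\sigma))(v) = (\sigma \circ T)(v) = \sigma(T(v)) = 0 = 0(v) \]

We have $(T^t(\sigma))(v) = 0(v)$ for every $v \in V$; hence, $T^t(\sigma) = 0$. Thus, $\sigma \in \operatorname{Ker} T^t$, and so
$(\operatorname{Im} T)^0 \subseteq \operatorname{Ker} T^t$.
The two inclusion relations together give us the required equality.


11.15. Suppose $V$ and $U$ have finite dimension and $T: V \to U$ is linear. Prove $\operatorname{rank}(T) = \operatorname{rank}(T^t)$.

Suppose $\dim V = n$ and $\dim U = m$, and suppose $\operatorname{rank}(T) = r$. By Theorem 11.5,

\[ \dim(\operatorname{Im} T)^0 = \dim U - \dim(\operatorname{Im} T) = m - \operatorname{rank}(T) = m - r \]

By Problem 11.14, $\operatorname{Ker} T^t = (\operatorname{Im} T)^0$. Hence, $\operatorname{nullity}(T^t) = m - r$. It then follows that, as claimed,

\[ \operatorname{rank}(T^t) = \dim U^* - \operatorname{nullity}(T^t) = m - (m - r) = r = \operatorname{rank}(T) \]


11.16. Prove Theorem 11.7: Let $T: V \to U$ be linear and let $A$ be the matrix representation of $T$ in the
bases $\{v_j\}$ of $V$ and $\{u_i\}$ of $U$. Then the transpose matrix $A^T$ is the matrix representation of
$T^t: U^* \to V^*$ in the bases dual to $\{u_i\}$ and $\{v_j\}$.

Suppose, for $j = 1, \ldots, m$,

\[ T(v_j) = a_{j1}u_1 + a_{j2}u_2 + \cdots + a_{jn}u_n \tag{1} \]

We want to prove that, for $i = 1, \ldots, n$,

\[ T^t(\sigma_i) = a_{1i}\phi_1 + a_{2i}\phi_2 + \cdots + a_{mi}\phi_m \tag{2} \]

where $\{\sigma_i\}$ and $\{\phi_j\}$ are the bases dual to $\{u_i\}$ and $\{v_j\}$, respectively.
Let $v \in V$ and suppose $v = k_1v_1 + k_2v_2 + \cdots + k_mv_m$. Then, by (1),

\begin{align*}
T(v) &= k_1T(v_1) + k_2T(v_2) + \cdots + k_mT(v_m) \\
&= k_1(a_{11}u_1 + \cdots + a_{1n}u_n) + k_2(a_{21}u_1 + \cdots + a_{2n}u_n) + \cdots + k_m(a_{m1}u_1 + \cdots + a_{mn}u_n) \\
&= (k_1a_{11} + k_2a_{21} + \cdots + k_ma_{m1})u_1 + \cdots + (k_1a_{1n} + k_2a_{2n} + \cdots + k_ma_{mn})u_n \\
&= \sum_{i=1}^{n} (k_1a_{1i} + k_2a_{2i} + \cdots + k_ma_{mi})u_i
\end{align*}

Hence, for $j = 1, \ldots, n$,

\[ (T^t(\sigma_j))(v) = \sigma_j(T(v)) = \sigma_j\Big(\sum_{i=1}^{n} (k_1a_{1i} + k_2a_{2i} + \cdots + k_ma_{mi})u_i\Big) = k_1a_{1j} + k_2a_{2j} + \cdots + k_ma_{mj} \tag{3} \]

On the other hand, for $j = 1, \ldots, n$,

\begin{align*}
(a_{1j}\phi_1 + a_{2j}\phi_2 + \cdots + a_{mj}\phi_m)(v) &= (a_{1j}\phi_1 + a_{2j}\phi_2 + \cdots + a_{mj}\phi_m)(k_1v_1 + k_2v_2 + \cdots + k_mv_m) \\
&= k_1a_{1j} + k_2a_{2j} + \cdots + k_ma_{mj} \tag{4}
\end{align*}

Because $v \in V$ was arbitrary, (3) and (4) imply that

\[ T^t(\sigma_j) = a_{1j}\phi_1 + a_{2j}\phi_2 + \cdots + a_{mj}\phi_m, \qquad j = 1, \ldots, n \]

which is (2). Thus, the theorem is proved.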
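
Theorem 11.7 can be illustrated numerically on $\mathbf{R}^n$ with the usual bases. The following is a minimal sketch under stated assumptions: a functional $\phi(y) = c \cdot y$ is stored as its coefficient list `c`, the map $T$ is stored as a matrix `M` acting on columns, and the helper names `transpose_apply` and `compose` are hypothetical. It compares $\phi \circ T$ evaluated directly with the functional whose coefficients are $M^T c$, using the data of Problem 11.13(c).

```python
M = [[2, -3],
     [5, 2]]          # T(x, y) = (2x - 3y, 5x + 2y)
c = [1, -2]           # phi(x, y) = x - 2y

def transpose_apply(M, c):
    """Coefficient list of phi o T, computed as M^T c."""
    rows, cols = len(M), len(M[0])
    return [sum(M[i][j] * c[i] for i in range(rows)) for j in range(cols)]

def compose(M, c, x):
    """Evaluate phi(T(x)) directly, without forming the transpose."""
    Tx = [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]
    return sum(ci * ti for ci, ti in zip(c, Tx))

coeffs = transpose_apply(M, c)                 # coefficients of T^t(phi)
x = [3, 7]                                     # an arbitrary test vector
direct = compose(M, c, x)
via_transpose = sum(k * xi for k, xi in zip(coeffs, x))
print(coeffs, direct, via_transpose)
```

The coefficient list agrees with the answer $-8x - 7y$ found in Problem 11.13(c), and the two evaluations coincide at the test vector.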


SUPPLEMENTARY PROBLEMS 


Dual Spaces and Dual Bases 


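11.17. Find (a) $\phi + \sigma$, (b) $3\phi$, (c) $2\phi - 5\sigma$, where $\phi: \mathbf{R}^3 \to \mathbf{R}$ and $\sigma: \mathbf{R}^3 \to \mathbf{R}$ are defined by

\[ \phi(x, y, z) = 2x - 3y + z \quad\text{and}\quad \sigma(x, y, z) = 4x - 2y + 3z \]


11.18. Find the dual basis of each of the following bases of $\mathbf{R}^3$: (a) $\{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$,
(b) $\{(1, -2, 3), (1, -1, 1), (2, -4, 7)\}$.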
11.19. Let $V$ be the vector space of polynomials over $\mathbf{R}$ of degree $\le 2$. Let $\phi_1, \phi_2, \phi_3$ be the linear functionals on
$V$ defined by

\[ \phi_1(f(t)) = \int_0^1 f(t)\,dt, \qquad \phi_2(f(t)) = f'(1), \qquad \phi_3(f(t)) = f(0) \]

Here $f(t) = a + bt + ct^2 \in V$ and $f'(t)$ denotes the derivative of $f(t)$. Find the basis $\{f_1(t), f_2(t), f_3(t)\}$ of
$V$ that is dual to $\{\phi_1, \phi_2, \phi_3\}$.


11.20. Suppose $u, v \in V$ and that $\phi(u) = 0$ implies $\phi(v) = 0$ for all $\phi \in V^*$. Show that $v = ku$ for some scalar $k$.


11.21. Suppose $\phi, \sigma \in V^*$ and that $\phi(v) = 0$ implies $\sigma(v) = 0$ for all $v \in V$. Show that $\sigma = k\phi$ for some scalar $k$.


11.22. Let $V$ be the vector space of polynomials over $K$. For $a \in K$, define $\phi_a: V \to K$ by $\phi_a(f(t)) = f(a)$. Show
that (a) $\phi_a$ is linear; (b) if $a \ne b$, then $\phi_a \ne \phi_b$.


11.23. Let $V$ be the vector space of polynomials of degree $\le 2$. Let $a, b, c \in K$ be distinct scalars. Let $\phi_a, \phi_b, \phi_c$
be the linear functionals defined by $\phi_a(f(t)) = f(a)$, $\phi_b(f(t)) = f(b)$, $\phi_c(f(t)) = f(c)$. Show that
$\{\phi_a, \phi_b, \phi_c\}$ is linearly independent, and find the basis $\{f_1(t), f_2(t), f_3(t)\}$ of $V$ that is its dual.


11.24. Let $V$ be the vector space of square matrices of order $n$. Let $T: V \to K$ be the trace mapping; that is,
$T(A) = a_{11} + a_{22} + \cdots + a_{nn}$, where $A = (a_{ij})$. Show that $T$ is linear.


11.25. Let $W$ be a subspace of $V$. For any linear functional $\phi$ on $W$, show that there is a linear functional $\sigma$ on $V$
such that $\sigma(w) = \phi(w)$ for any $w \in W$; that is, $\phi$ is the restriction of $\sigma$ to $W$.


11.26. Let $\{e_1, \ldots, e_n\}$ be the usual basis of $K^n$. Show that the dual basis is $\{\pi_1, \ldots, \pi_n\}$ where $\pi_i$ is the $i$th
projection mapping; that is, $\pi_i(a_1, \ldots, a_n) = a_i$.


11.27. Let $V$ be a vector space over $\mathbf{R}$. Let $\phi_1, \phi_2 \in V^*$ and suppose $\sigma: V \to \mathbf{R}$, defined by $\sigma(v) = \phi_1(v)\phi_2(v)$,
also belongs to $V^*$. Show that either $\phi_1 = 0$ or $\phi_2 = 0$.


Annihilators 


11.28. Let $W$ be the subspace of $\mathbf{R}^4$ spanned by $(1, 2, -3, 4)$, $(1, 3, -2, 6)$, $(1, 4, -1, 8)$. Find a basis of the
annihilator of $W$.


11.29. Let $W$ be the subspace of $\mathbf{R}^3$ spanned by $(1, 1, 0)$ and $(0, 1, 1)$. Find a basis of the annihilator of $W$.


11.30. Show that, for any subset $S$ of $V$, $\operatorname{span}(S) = S^{00}$, where $\operatorname{span}(S)$ is the linear span of $S$.


11.31. Let $U$ and $W$ be subspaces of a vector space $V$ of finite dimension. Prove that $(U \cap W)^0 = U^0 + W^0$.


11.32. Suppose $V = U \oplus W$. Prove that $V^* = U^0 \oplus W^0$.


Transpose of a Linear Mapping 


11.33. Let $\phi$ be the linear functional on $\mathbf{R}^2$ defined by $\phi(x, y) = 3x - 2y$. For each of the following linear
mappings $T: \mathbf{R}^3 \to \mathbf{R}^2$, find $(T^t(\phi))(x, y, z)$:

(a) $T(x, y, z) = (x + y, y + z)$, (b) $T(x, y, z) = (x + y + z, 2x - y)$


11.34. Suppose $T_1: U \to V$ and $T_2: V \to W$ are linear. Prove that $(T_2 \circ T_1)^t = T_1^t \circ T_2^t$.


11.35. Suppose $T: V \to U$ is linear and $V$ has finite dimension. Prove that $\operatorname{Im} T^t = (\operatorname{Ker} T)^0$.


11.36. Suppose $T: V \to U$ is linear and $u \in U$. Prove that $u \in \operatorname{Im} T$ or there exists $\phi \in U^*$ such that $T^t(\phi) = 0$
and $\phi(u) = 1$.


11.37. Let $V$ be of finite dimension. Show that the mapping $T \mapsto T^t$ is an isomorphism from $\operatorname{Hom}(V, V)$ onto
$\operatorname{Hom}(V^*, V^*)$. (Here $T$ is any linear operator on $V$.)


Miscellaneous Problems 


11.38. Let $V$ be a vector space over $\mathbf{R}$. The line segment $\overline{uv}$ joining points $u, v \in V$ is defined by
$\overline{uv} = \{tu + (1 - t)v : 0 \le t \le 1\}$. A subset $S$ of $V$ is convex if $u, v \in S$ implies $\overline{uv} \subseteq S$. Let $\phi \in V^*$. Define

\[ W^+ = \{v \in V : \phi(v) > 0\}, \qquad W = \{v \in V : \phi(v) = 0\}, \qquad W^- = \{v \in V : \phi(v) < 0\} \]

Prove that $W^+$, $W$, and $W^-$ are convex.


11.39. Let $V$ be a vector space of finite dimension. A hyperplane $H$ of $V$ may be defined as the kernel of a nonzero
linear functional $\phi$ on $V$. Show that every subspace of $V$ is the intersection of a finite number of
hyperplanes.


ANSWERS TO SUPPLEMENTARY PROBLEMS 


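11.17. (a) $6x - 5y + 4z$, (b) $6x - 9y + 3z$, (c) $-16x + 4y - 13z$


11.18. (a) $\phi_1 = x$, $\phi_2 = y$, $\phi_3 = z$; (b) $\phi_1 = -3x - 5y - 2z$, $\phi_2 = 2x + y$, $\phi_3 = x + 2y + z$


11.19. $f_1(t) = 3t - \tfrac{3}{2}t^2$, $f_2(t) = -\tfrac{1}{2}t + \tfrac{3}{4}t^2$, $f_3(t) = 1 - 3t + \tfrac{3}{2}t^2$


11.22. (b) Let $f(t) = t$. Then $\phi_a(f(t)) = a \ne b = \phi_b(f(t))$; and therefore, $\phi_a \ne \phi_b$


11.23. $\left\{ f_1(t) = \dfrac{t^2 - (b + c)t + bc}{(a - b)(a - c)},\ f_2(t) = \dfrac{t^2 - (a + c)t + ac}{(b - a)(b - c)},\ f_3(t) = \dfrac{t^2 - (a + b)t + ab}{(c - a)(c - b)} \right\}$


11.28. $\{\phi_1(x, y, z, t) = 5x - y + z,\ \phi_2(x, y, z, t) = 2y - t\}$


11.29. $\{\phi(x, y, z) = x - y + z\}$


11.33. (a) $(T^t(\phi))(x, y, z) = 3x + y - 2z$, (b) $(T^t(\phi))(x, y, z) = -x + 5y + 3z$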
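
The dual basis in answer 11.18(b) can be cross-checked by a matrix inversion: if the basis vectors are placed as the columns of a matrix $B$, then the rows of $B^{-1}$ hold the coefficients of the dual functionals. The sketch below is illustrative only; the helper name `inverse` is hypothetical, exact rational arithmetic is an assumption made to avoid rounding, and the routine assumes the input matrix is invertible.

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a row with a nonzero pivot (assumes A is invertible).
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

basis = [(1, -2, 3), (1, -1, 1), (2, -4, 7)]
# Columns of B are the basis vectors; rows of B^{-1} are the dual functionals.
B = [[v[i] for v in basis] for i in range(3)]
dual = [[int(x) for x in row] for row in inverse(B)]
print(dual)
```

Each row of `dual` lists the coefficients of $x, y, z$ in one functional $\phi_i$.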


Bilinear, Quadratic, 
and Hermitian Forms 


12.1 Introduction 


This chapter generalizes the notions of linear mappings and linear functionals. Specifically, we introduce 
the notion of a bilinear form. These bilinear maps also give rise to quadratic and Hermitian forms. 
Although quadratic forms were discussed previously, this chapter is treated independently of the previous 
results. 

Although the field K is arbitrary, we will later specialize to the cases K = R and K = C. Furthermore, 
we may sometimes need to divide by 2. In such cases, we must assume that $1 + 1 \ne 0$, which is true when
$K = \mathbf{R}$ or $K = \mathbf{C}$.


12.2 Bilinear Forms 


Let V be a vector space of finite dimension over a field K. A bilinear form on V is a mapping 
$f: V \times V \to K$ such that, for all $a, b \in K$ and all $u_i, v_i \in V$:


(i) $f(au_1 + bu_2, v) = af(u_1, v) + bf(u_2, v)$,
(ii) $f(u, av_1 + bv_2) = af(u, v_1) + bf(u, v_2)$


We express condition (i) by saying f is linear in the first variable, and condition (ii) by saying f is linear 
in the second variable. 


EXAMPLE 12.1 
(a) Let $f$ be the dot product on $\mathbf{R}^n$; that is, for $u = (a_i)$ and $v = (b_i)$,

\[ f(u, v) = u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n \]

Then $f$ is a bilinear form on $\mathbf{R}^n$. (In fact, any inner product on a real vector space $V$ is a bilinear form
on $V$.)


(b) Let $\phi$ and $\sigma$ be arbitrary linear functionals on $V$. Let $f: V \times V \to K$ be defined by $f(u, v) = \phi(u)\sigma(v)$. Then $f$ is
a bilinear form, because $\phi$ and $\sigma$ are each linear.


(c) Let $A = [a_{ij}]$ be any $n \times n$ matrix over a field $K$. Then $A$ may be identified with the following bilinear form $f$ on
$K^n$, where $X = [x_i]$ and $Y = [y_i]$ are column vectors of variables:

\[ f(X, Y) = X^TAY = \sum_{i,j} a_{ij}x_iy_j = a_{11}x_1y_1 + a_{12}x_1y_2 + \cdots + a_{nn}x_ny_n \]


The above formal expression in the variables x;, y; is termed the bilinear polynomial corresponding to the matrix 
A. Equation (12.1) shows that, in a certain sense, every bilinear form is of this type. 
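
To make part (c) concrete, the following Python sketch evaluates $f(X, Y) = X^TAY$ for an illustrative $3 \times 3$ matrix and spot-checks linearity in the first variable. The matrix, the vectors, the scalars, and the helper name `bilinear` are all assumptions chosen for this example; a numerical check at one point illustrates the property but does not prove it.

```python
def bilinear(A, X, Y):
    """Evaluate f(X, Y) = X^T A Y = sum over i, j of a_ij * x_i * y_j."""
    return sum(A[i][j] * X[i] * Y[j]
               for i in range(len(X)) for j in range(len(Y)))

A = [[3, 0, -2],
     [5, 7, -8],
     [0, 4, -6]]
X1, X2, Y = [1, 2, 3], [0, -1, 4], [2, 1, -1]
a, b = 5, -3

# Linearity in the first variable: f(a*X1 + b*X2, Y) = a*f(X1, Y) + b*f(X2, Y).
combo = [a * p + b * q for p, q in zip(X1, X2)]
lhs = bilinear(A, combo, Y)
rhs = a * bilinear(A, X1, Y) + b * bilinear(A, X2, Y)
print(lhs, rhs)
```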


Space of Bilinear Forms


Let $B(V)$ denote the set of all bilinear forms on $V$. A vector space structure is placed on $B(V)$, where for
any $f, g \in B(V)$ and any $k \in K$, we define $f + g$ and $kf$ as follows:

\[ (f + g)(u, v) = f(u, v) + g(u, v) \quad\text{and}\quad (kf)(u, v) = kf(u, v) \]

The following theorem (proved in Problem 12.4) applies.

THEOREM 12.1: Let $V$ be a vector space of dimension $n$ over $K$. Let $\{\phi_1, \ldots, \phi_n\}$ be any basis of the
dual space $V^*$. Then $\{f_{ij} : i, j = 1, \ldots, n\}$ is a basis of $B(V)$, where $f_{ij}$ is defined by
$f_{ij}(u, v) = \phi_i(u)\phi_j(v)$. Thus, in particular, $\dim B(V) = n^2$.


12.3 Bilinear Forms and Matrices 


Let $f$ be a bilinear form on $V$ and let $S = \{u_1, \ldots, u_n\}$ be a basis of $V$. Suppose $u, v \in V$ and

\[ u = a_1u_1 + \cdots + a_nu_n \quad\text{and}\quad v = b_1u_1 + \cdots + b_nu_n \]

Then

\[ f(u, v) = f(a_1u_1 + \cdots + a_nu_n,\ b_1u_1 + \cdots + b_nu_n) = \sum_{i,j} a_ib_jf(u_i, u_j) \]

Thus, $f$ is completely determined by the $n^2$ values $f(u_i, u_j)$.

The matrix $A = [a_{ij}]$ where $a_{ij} = f(u_i, u_j)$ is called the matrix representation of $f$ relative to the basis $S$
or, simply, the "matrix of $f$ in $S$." It "represents" $f$ in the sense that, for all $u, v \in V$,

\[ f(u, v) = \sum_{i,j} a_ib_jf(u_i, u_j) = [u]_S^T A [v]_S \tag{12.1} \]

[As usual, $[u]_S$ denotes the coordinate (column) vector of $u$ in the basis $S$.]


Change of Basis, Congruent Matrices 


We now ask, how does a matrix representing a bilinear form transform when a new basis is selected? The 
answer is given in the following theorem (proved in Problem 12.5). 


THEOREM 12.2: Let $P$ be a change-of-basis matrix from one basis $S$ to another basis $S'$. If $A$ is the
matrix representing a bilinear form $f$ in the original basis $S$, then $B = P^TAP$ is the
matrix representing $f$ in the new basis $S'$.
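
Theorem 12.2 can be checked on a small example. The sketch below is illustrative only: the bilinear form on $\mathbf{R}^2$, the two bases, and the helper names `gram`, `matmul`, and `transpose` are assumptions made for this example. It builds the matrix of $f$ in each basis directly from the definition $a_{ij} = f(u_i, u_j)$ and compares the second matrix with $P^TAP$.

```python
def f(u, v):
    """Illustrative bilinear form on R^2 (an assumption for this sketch)."""
    return 2 * u[0] * v[0] - 3 * u[0] * v[1] + 4 * u[1] * v[1]

def gram(basis):
    """Matrix [f(b_i, b_j)] of f relative to the given basis."""
    return [[f(bi, bj) for bj in basis] for bi in basis]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

S = [(1, 0), (1, 1)]         # old basis u1, u2
S_new = [(2, 1), (1, -1)]    # new basis v1 = u1 + u2, v2 = 2*u1 - u2
P = [[1, 2],
     [1, -1]]                # columns: coordinates of v1, v2 relative to S

A = gram(S)
B = gram(S_new)
print(A, B, matmul(matmul(transpose(P), A), P))
```

The columns of `P` are the coordinate vectors of the new basis vectors relative to the old basis, which is the convention under which $B = P^TAP$ holds.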


The above theorem motivates the following definition. 


DEFINITION: A matrix $B$ is congruent to a matrix $A$, written $B \simeq A$, if there exists a nonsingular
matrix $P$ such that $B = P^TAP$.


Thus, by Theorem 12.2, matrices representing the same bilinear form are congruent. We remark that 
congruent matrices have the same rank, because $P$ and $P^T$ are nonsingular; hence, the following definition
is well defined. 


DEFINITION: The rank of a bilinear form f on V, written rank( f), is the rank of any matrix 


representation of f. We say f is degenerate or nondegenerate according to whether 
rank( f) < dim V or rank( f) = dim V. 


12.4 Alternating Bilinear Forms 


Let f be a bilinear form on V. Then f is called 
(i) alternating if f (v, v) = 0 for every v € V; 
(ii) skew-symmetric if f (u, v) = —f(v,u) for every u, v € V. 
Now suppose (i) is true. Then (ii) is true, because, for any $u, v \in V$,

\[ 0 = f(u + v, u + v) = f(u, u) + f(u, v) + f(v, u) + f(v, v) = f(u, v) + f(v, u) \]

On the other hand, suppose (ii) is true and also $1 + 1 \ne 0$. Then (i) is true, because, for every $v \in V$, we
have $f(v, v) = -f(v, v)$. In other words, alternating and skew-symmetric are equivalent when $1 + 1 \ne 0$.


The main structure theorem of alternating bilinear forms (proved in Problem 12.23) is as follows. 


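THEOREM 12.3: Let $f$ be an alternating bilinear form on $V$. Then there exists a basis of $V$ in which $f$
is represented by a block diagonal matrix $M$ of the form

\[ M = \operatorname{diag}\left( \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \ldots, \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, [0], [0], \ldots, [0] \right) \]

Moreover, the number of nonzero blocks is uniquely determined by $f$ [because it is
equal to $\tfrac{1}{2}\operatorname{rank}(f)$].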


In particular, the above theorem shows that any alternating bilinear form must have even rank. 
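
As a small numerical illustration of the two conditions, the sketch below uses a skew-symmetric matrix over $\mathbf{R}$ and evaluates $f(v, v)$ and $f(u, v) + f(v, u)$ on a few sample vectors. The matrix, the sample vectors, and the helper name `f` are assumptions chosen for this example.

```python
def f(A, X, Y):
    """Evaluate the bilinear form X^T A Y."""
    return sum(A[i][j] * X[i] * Y[j]
               for i in range(len(X)) for j in range(len(Y)))

# Illustrative skew-symmetric matrix (A^T = -A), chosen for this sketch.
A = [[0, 2, -1],
     [-2, 0, 4],
     [1, -4, 0]]
samples = [[1, 0, 0], [1, 2, 3], [-2, 5, 7]]

alternating = [f(A, v, v) for v in samples]                        # f(v, v)
skew = [f(A, u, v) + f(A, v, u) for u in samples for v in samples]  # f(u,v) + f(v,u)
print(alternating, skew)
```

This particular matrix has rank 2: its determinant is zero because it is skew-symmetric of odd order, while its upper-left $2 \times 2$ block is nonsingular. That is consistent with the even-rank statement above.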


12.5 Symmetric Bilinear Forms, Quadratic Forms 


This section investigates the important notions of symmetric bilinear forms and quadratic forms and their 
representation by means of symmetric matrices. The only restriction on the field $K$ is that $1 + 1 \ne 0$. In
Section 12.6, we will restrict K to be the real field R, which yields important special results. 


Symmetric Bilinear Forms 


Let f be a bilinear form on V. Then f is said to be symmetric if, for every u, v € V, 


f(u, v) =f (v, u) 
One can easily show that f is symmetric if and only if any matrix representation A of f is a symmetric 
matrix. 
The main result for symmetric bilinear forms (proved in Problem 12.10) is as follows. (We emphasize 
that we are assuming that $1 + 1 \ne 0$.)


THEOREM 12.4: Let $f$ be a symmetric bilinear form on $V$. Then $V$ has a basis $\{v_1, \ldots, v_n\}$ in which $f$
is represented by a diagonal matrix; that is, where $f(v_i, v_j) = 0$ for $i \ne j$.


THEOREM 12.4: (Alternative Form) Let $A$ be a symmetric matrix over $K$. Then $A$ is congruent to a
diagonal matrix; that is, there exists a nonsingular matrix $P$ such that $P^TAP$ is
diagonal.


Diagonalization Algorithm 


Recall that a nonsingular matrix P is a product of elementary matrices. Accordingly, one way of 
obtaining the diagonal form $D = P^TAP$ is by a sequence of elementary row operations and the same
sequence of elementary column operations. This same sequence of elementary row operations on the
identity matrix $I$ will yield $P^T$. This algorithm is formalized below.


ALGORITHM 12.1: (Congruence Diagonalization of a Symmetric Matrix) The input is a symmetric
matrix $A = [a_{ij}]$ of order $n$.


Step 1. Form the $n \times 2n$ (block) matrix $M = [A_1, I]$, where $A_1 = A$ is the left half of $M$ and the identity
matrix $I$ is the right half of $M$.


Step 2. Examine the entry $a_{11}$. There are three cases.

Case I: $a_{11} \ne 0$. (Use $a_{11}$ as a pivot to put 0's below $a_{11}$ in $M$ and to the right of $a_{11}$ in $A_1$.)
For $i = 2, \ldots, n$:

(a) Apply the row operation "Replace $R_i$ by $-a_{i1}R_1 + a_{11}R_i$."
(b) Apply the corresponding column operation "Replace $C_i$ by $-a_{i1}C_1 + a_{11}C_i$."

These operations reduce the matrix $M$ to the form

\[ M \sim \begin{bmatrix} a_{11} & 0 & * & * \\ 0 & A_2 & * & * \end{bmatrix} \tag{$*$} \]

Case II: $a_{11} = 0$ but $a_{kk} \ne 0$, for some $k > 1$.

(a) Apply the row operation "Interchange $R_1$ and $R_k$."
(b) Apply the corresponding column operation "Interchange $C_1$ and $C_k$."

(These operations bring $a_{kk}$ into the first diagonal position, which reduces the matrix
to Case I.)

Case III: All diagonal entries $a_{ii} = 0$ but some $a_{ij} \ne 0$.

(a) Apply the row operation "Replace $R_i$ by $R_j + R_i$."
(b) Apply the corresponding column operation "Replace $C_i$ by $C_j + C_i$."

(These operations bring $2a_{ij}$ into the $i$th diagonal position, which reduces the matrix
to Case II.)

Thus, $M$ is finally reduced to the form $(*)$, where $A_2$ is a symmetric matrix of order less than
$A$.


Step 3. Repeat Step 2 with each new matrix $A_k$ (by neglecting the first row and column of the
preceding matrix) until $A$ is diagonalized. Then $M$ is transformed into the form $M' = [D, Q]$,
where $D$ is diagonal.


Step 4. Set $P = Q^T$. Then $D = P^TAP$.


Remark 1: We emphasize that in Step 2, the row operations will change both sides of $M$, but the
column operations will only change the left half of $M$.


Remark 2: The condition $1 + 1 \ne 0$ is used in Case III, where we assume that $2a_{ij} \ne 0$ when
$a_{ij} \ne 0$.


The justification for the above algorithm appears in Problem 12.9. 
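
The steps of Algorithm 12.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: entries are taken to be rational and handled with `fractions.Fraction` so that no rounding occurs, and the names `congruence_diagonalize`, `row_op`, `col_op`, and `swap` are hypothetical. Row operations act on the whole block matrix $M = [A, I]$, while column operations touch only its left half, as in Remark 1.

```python
from fractions import Fraction

def congruence_diagonalize(A):
    """Return (D, P) with D = P^T A P diagonal, for a symmetric matrix A."""
    n = len(A)
    # Step 1: M = [A, I]. Row operations act on all of M,
    # column operations only on the left half.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]

    def row_op(dst, src, c_src, c_dst=1):
        """R_dst <- c_src * R_src + c_dst * R_dst (whole row of M)."""
        M[dst] = [c_src * s + c_dst * d for s, d in zip(M[src], M[dst])]

    def col_op(dst, src, c_src, c_dst=1):
        """C_dst <- c_src * C_src + c_dst * C_dst (left half only)."""
        for r in range(n):
            M[r][dst] = c_src * M[r][src] + c_dst * M[r][dst]

    def swap(i, j):
        """Interchange rows i, j of M, then columns i, j of the left half."""
        M[i], M[j] = M[j], M[i]
        for r in range(n):
            M[r][i], M[r][j] = M[r][j], M[r][i]

    for k in range(n):
        if M[k][k] == 0:
            j = next((j for j in range(k + 1, n) if M[j][j] != 0), None)
            if j is not None:                      # Case II
                swap(k, j)
            else:                                  # Case III
                pair = next(((i, j) for i in range(k, n)
                             for j in range(i + 1, n) if M[i][j] != 0), None)
                if pair is None:
                    break                          # remaining block is zero
                i, j = pair
                # R_i <- R_j + R_i, then C_i <- C_j + C_i:
                # entry (i, i) becomes 2 * a_ij.
                row_op(i, j, 1)
                col_op(i, j, 1)
                if i != k:
                    swap(k, i)
        pivot = M[k][k]                            # Case I
        for i in range(k + 1, n):
            c = -M[i][k]
            row_op(i, k, c, pivot)
            col_op(i, k, c, pivot)

    D = [row[:n] for row in M]
    Q = [row[n:] for row in M]
    P = [list(col) for col in zip(*Q)]             # Step 4: P = Q^T
    return D, P

A = [[1, 2, -3], [2, 5, -4], [-3, -4, 8]]
D, P = congruence_diagonalize(A)
print(D, P)
```

Because Case I scales rows and columns by the pivot instead of dividing, the diagonal entries produced can differ by square factors from those obtained with other choices of operations. Any such result is still congruent to $A$.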


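EXAMPLE 12.2 Let $A = \begin{bmatrix} 1 & 2 & -3 \\ 2 & 5 & -4 \\ -3 & -4 & 8 \end{bmatrix}$. Apply Algorithm 12.1 to find a nonsingular matrix $P$ such
that $D = P^TAP$ is diagonal.

First form the block matrix $M = [A, I]$; that is, let

\[ M = [A, I] = \left[\begin{array}{rrr|rrr} 1 & 2 & -3 & 1 & 0 & 0 \\ 2 & 5 & -4 & 0 & 1 & 0 \\ -3 & -4 & 8 & 0 & 0 & 1 \end{array}\right] \]

Apply the row operations "Replace $R_2$ by $-2R_1 + R_2$" and "Replace $R_3$ by $3R_1 + R_3$" to $M$, and then apply the
corresponding column operations "Replace $C_2$ by $-2C_1 + C_2$" and "Replace $C_3$ by $3C_1 + C_3$" to obtain

\[ \left[\begin{array}{rrr|rrr} 1 & 2 & -3 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right] \quad\text{and then}\quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right] \]

Next apply the row operation "Replace $R_3$ by $-2R_2 + R_3$" and then the corresponding column operation "Replace
$C_3$ by $-2C_2 + C_3$" to obtain

\[ \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right] \quad\text{and then}\quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right] \]

Now $A$ has been diagonalized. Set

\[ P = \begin{bmatrix} 1 & -2 & 7 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{and then}\quad D = P^TAP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -5 \end{bmatrix} \]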


We emphasize that P is the transpose of the right half of the final matrix. 


Quadratic Forms 
We begin with a definition. 


DEFINITION A: A mapping q:V — K is a quadratic form if q(v) = f(v, v) for some symmetric 
bilinear form f on V. 


If $1 + 1 \ne 0$ in $K$, then the bilinear form $f$ can be obtained from the quadratic form $q$ by the following
polar form of $f$:

\[ f(u, v) = \tfrac{1}{2}[q(u + v) - q(u) - q(v)] \]


Now suppose $f$ is represented by a symmetric matrix $A = [a_{ij}]$, and $1 + 1 \ne 0$. Letting $X = [x_i]$
denote a column vector of variables, $q$ can be represented in the form

\[ q(X) = f(X, X) = X^TAX = \sum_{i,j} a_{ij}x_ix_j = \sum_i a_{ii}x_i^2 + 2\sum_{i<j} a_{ij}x_ix_j \]
The above formal expression in the variables x; is also called a quadratic form. Namely, we have the 
following second definition. 


DEFINITION B: A quadratic form $q$ in variables $x_1, x_2, \ldots, x_n$ is a polynomial such that every term
has degree two; that is,

\[ q(x_1, x_2, \ldots, x_n) = \sum_i c_ix_i^2 + \sum_{i<j} d_{ij}x_ix_j \]

Using $1 + 1 \ne 0$, the quadratic form $q$ in Definition B determines a symmetric matrix $A = [a_{ij}]$ where
$a_{ii} = c_i$ and $a_{ij} = a_{ji} = \tfrac{1}{2}d_{ij}$. Thus, Definitions A and B are essentially the same.
If the matrix representation $A$ of $q$ is diagonal, then $q$ has the diagonal representation

\[ q(X) = X^TAX = a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2 \]
That is, the quadratic polynomial representing q will contain no ‘‘cross product’’ terms. Moreover, by 


Theorem 12.4, every quadratic form has such a representation (when $1 + 1 \ne 0$).
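
The passage between Definition B and a symmetric matrix, together with the polar form, can be sketched in Python. The example polynomial, the vectors, and the helper names `symmetric_matrix`, `f`, and `q` are assumptions made for illustration; exact rational arithmetic is used so that halving the cross-product coefficients introduces no rounding.

```python
from fractions import Fraction

def symmetric_matrix(n, squares, cross):
    """Build A from q = sum c_i x_i^2 + sum_{i<j} d_ij x_i x_j (0-based indices)."""
    A = [[Fraction(0)] * n for _ in range(n)]
    for i, c in squares.items():
        A[i][i] = Fraction(c)
    for (i, j), d in cross.items():
        A[i][j] = A[j][i] = Fraction(d, 2)   # a_ij = a_ji = d_ij / 2
    return A

def f(A, X, Y):
    """Symmetric bilinear form X^T A Y."""
    return sum(A[i][j] * X[i] * Y[j]
               for i in range(len(X)) for j in range(len(Y)))

def q(A, X):
    """Quadratic form q(X) = f(X, X)."""
    return f(A, X, X)

# Illustrative form q(x, y, z) = 3x^2 + 4xy - y^2 + 8xz - 6yz + z^2.
A = symmetric_matrix(3, {0: 3, 1: -1, 2: 1}, {(0, 1): 4, (0, 2): 8, (1, 2): -6})
u, v = [1, 2, -1], [0, 3, 5]
u_plus_v = [a + b for a, b in zip(u, v)]
polar = Fraction(1, 2) * (q(A, u_plus_v) - q(A, u) - q(A, v))
print(A, polar, f(A, u, v))
```

The value `polar` is computed from $q$ alone, and it agrees with $f(u, v)$ evaluated from the matrix.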


12.6 Real Symmetric Bilinear Forms, Law of Inertia 


This section treats symmetric bilinear forms and quadratic forms on vector spaces V over the real field R. 
The special nature of R permits an independent theory. The main result (proved in Problem 12.14) is as 
follows. 


THEOREM 12.5: Let $f$ be a symmetric form on $V$ over $\mathbf{R}$. Then there exists a basis of $V$ in which $f$ is
represented by a diagonal matrix. Every other diagonal matrix representation of $f$ has
the same number $p$ of positive entries and the same number $n$ of negative entries.
The above result is sometimes called the Law of Inertia or Sylvester's Theorem. The rank and
signature of the symmetric bilinear form f are denoted and defined by 


rank( f)=p+n and sig(f)=p-n 


These are uniquely defined by Theorem 12.5. 
A real symmetric bilinear form f is said to be 


(i) positive definite if $q(v) = f(v, v) > 0$ for every $v \ne 0$,
(ii) nonnegative semidefinite if $q(v) = f(v, v) \ge 0$ for every $v$.


EXAMPLE 12.3 Let $f$ be the dot product on $\mathbf{R}^n$. Recall that $f$ is a symmetric bilinear form on $\mathbf{R}^n$. We note
that $f$ is also positive definite. That is, for any $u = (a_i) \ne 0$ in $\mathbf{R}^n$,

\[ f(u, u) = a_1^2 + a_2^2 + \cdots + a_n^2 > 0 \]


Section 12.5 and Chapter 13 tell us how to diagonalize a real quadratic form q or, equivalently, a real 
symmetric matrix A by means of an orthogonal transition matrix P. If P is merely nonsingular, then q can 
be represented in diagonal form with only 1’s and —1’s as nonzero coefficients. Namely, we have the 
following corollary. 


COROLLARY 12.6: Any real quadratic form $q$ has a unique representation in the form

\[ q(x_1, x_2, \ldots, x_n) = x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_r^2 \]

where $r = p + n$ is the rank of the form.


COROLLARY 12.6: (Alternative Form) Any real symmetric matrix $A$ is congruent to the unique
diagonal matrix

\[ D = \operatorname{diag}(I_p, -I_n, 0) \]


where r = p + n is the rank of A. 
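
Given any real diagonal representation, the rank and signature can be read off from the signs of the diagonal entries, and the entries can be scaled to $\pm 1$ by a congruence with a diagonal matrix. The Python sketch below illustrates this; the sample diagonal and the helper names `rank_and_signature` and `normalize` are assumptions for this example, and floating-point arithmetic means the scaled entries are only approximately $\pm 1$.

```python
import math

def rank_and_signature(diagonal):
    """Rank p + n and signature p - n from the entries of a real diagonal form."""
    p = sum(1 for d in diagonal if d > 0)
    n = sum(1 for d in diagonal if d < 0)
    return p + n, p - n

def normalize(diagonal):
    """Scale by P = diag(b_i) with b_i = 1/sqrt(|d_i|), or b_i = 1 if d_i = 0."""
    b = [1 / math.sqrt(abs(d)) if d != 0 else 1.0 for d in diagonal]
    # P^T D P for diagonal P and D is the entrywise product b_i * d_i * b_i.
    return [bi * d * bi for bi, d in zip(b, diagonal)]

diag = [1, -2, 18, 0]          # illustrative diagonal representation
print(rank_and_signature(diag), normalize(diag))
```

The sketch does not reorder the entries. Putting the $+1$'s first, then the $-1$'s, then the zeros is a further congruence by a permutation matrix.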


12.7 Hermitian Forms 


Let V be a vector space of finite dimension over the complex field C. A Hermitian form on V is a 
mapping $f: V \times V \to \mathbf{C}$ such that, for all $a, b \in \mathbf{C}$ and all $u_i, v \in V$,


(i) $f(au_1 + bu_2, v) = af(u_1, v) + bf(u_2, v)$,
(ii) $f(u, v) = \overline{f(v, u)}$.


(As usual, $\bar{k}$ denotes the complex conjugate of $k \in \mathbf{C}$.)
Using (i) and (ii), we get

\begin{align*}
f(u, av_1 + bv_2) &= \overline{f(av_1 + bv_2, u)} = \overline{af(v_1, u) + bf(v_2, u)} \\
&= \bar{a}\,\overline{f(v_1, u)} + \bar{b}\,\overline{f(v_2, u)} = \bar{a}f(u, v_1) + \bar{b}f(u, v_2)
\end{align*}

That is,
(iii) $f(u, av_1 + bv_2) = \bar{a}f(u, v_1) + \bar{b}f(u, v_2)$.


As before, we express condition (i) by saying $f$ is linear in the first variable. On the other hand, we
express condition (iii) by saying $f$ is "conjugate linear" in the second variable. Moreover, condition (ii)
tells us that $f(v, v) = \overline{f(v, v)}$, and hence, $f(v, v)$ is real for every $v \in V$.

The results of Sections 12.5 and 12.6 for symmetric forms have their analogues for Hermitian forms.
Thus, the mapping $q: V \to \mathbf{R}$, defined by $q(v) = f(v, v)$, is called the Hermitian quadratic form or
complex quadratic form associated with the Hermitian form $f$. We can obtain $f$ from $q$ by the polar form

\[ f(u, v) = \tfrac{1}{4}[q(u + v) - q(u - v)] + \tfrac{i}{4}[q(u + iv) - q(u - iv)] \]
Now suppose $S = \{u_1, \ldots, u_n\}$ is a basis of $V$. The matrix $H = [h_{ij}]$ where $h_{ij} = f(u_i, u_j)$ is called the
matrix representation of $f$ in the basis $S$. By (ii), $f(u_i, u_j) = \overline{f(u_j, u_i)}$; hence, $H$ is Hermitian and, in
particular, the diagonal entries of H are real. Thus, any diagonal representation of f contains only real 
entries. 
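
The Hermitian polar form can be spot-checked with Python's built-in complex numbers. The sketch below is illustrative only: the Hermitian matrix `H`, the vectors, and the helper names `hermitian_form`, `q`, and `polar` are assumptions for this example. The form is taken as $f(u, v) = \sum_{i,j} h_{ij}u_i\bar{v}_j$, which is linear in the first variable and conjugate linear in the second.

```python
def hermitian_form(H, u, v):
    """f(u, v) = sum over i, j of h_ij * u_i * conj(v_j)."""
    return sum(H[i][j] * u[i] * v[j].conjugate()
               for i in range(len(u)) for j in range(len(v)))

H = [[2, 1 + 2j],
     [1 - 2j, -3]]                 # illustrative Hermitian matrix (H equals its conjugate transpose)

def q(u):
    """Hermitian quadratic form q(u) = f(u, u)."""
    return hermitian_form(H, u, u)

def polar(u, v):
    """Recover f(u, v) from q via the polar form with coefficients 1/4 and i/4."""
    plus = [a + b for a, b in zip(u, v)]
    minus = [a - b for a, b in zip(u, v)]
    plus_i = [a + 1j * b for a, b in zip(u, v)]
    minus_i = [a - 1j * b for a, b in zip(u, v)]
    return (q(plus) - q(minus)) / 4 + 1j * (q(plus_i) - q(minus_i)) / 4

u, v = [1 + 1j, 2], [3, -1j]
print(hermitian_form(H, u, v), polar(u, v), q(u))
```

The last printed value has zero imaginary part, in line with the remark that $f(v, v)$ is real for every $v$.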

The next theorem (to be proved in Problem 12.47) is the complex analog of Theorem 12.5 on real 


symmetric bilinear forms. 


THEOREM 12.7: Let f be a Hermitian form on V over C. Then there exists a basis of V in which f is 
represented by a diagonal matrix. Every other diagonal matrix representation of f 
has the same number p of positive entries and the same number n of negative 
entries. 


Again the rank and signature of the Hermitian form f are denoted and defined by 


rank( f) =p+n and sig( f)=p-—n 


These are uniquely defined by Theorem 12.7. 
Analogously, a Hermitian form f is said to be 


(i) positive definite if $q(v) = f(v, v) > 0$ for every $v \ne 0$,
(ii) nonnegative semidefinite if $q(v) = f(v, v) \ge 0$ for every $v$.


EXAMPLE 12.4 Let $f$ be the dot product on $\mathbf{C}^n$; that is, for any $u = (z_i)$ and $v = (w_i)$ in $\mathbf{C}^n$,

\[ f(u, v) = u \cdot v = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n \]

Then $f$ is a Hermitian form on $\mathbf{C}^n$. Moreover, $f$ is also positive definite, because, for any $u = (z_i) \ne 0$ in $\mathbf{C}^n$,

\[ f(u, u) = z_1\bar{z}_1 + z_2\bar{z}_2 + \cdots + z_n\bar{z}_n = |z_1|^2 + |z_2|^2 + \cdots + |z_n|^2 > 0 \]


SOLVED PROBLEMS 


Bilinear Forms 


12.1. Let $u = (x_1, x_2, x_3)$ and $v = (y_1, y_2, y_3)$. Express $f$ in matrix notation, where

\[ f(u, v) = 3x_1y_1 - 2x_1y_3 + 5x_2y_1 + 7x_2y_2 - 8x_2y_3 + 4x_3y_2 - 6x_3y_3 \]

Let $A = [a_{ij}]$, where $a_{ij}$ is the coefficient of $x_iy_j$. Then

\[ f(u, v) = X^TAY = [x_1, x_2, x_3] \begin{bmatrix} 3 & 0 & -2 \\ 5 & 7 & -8 \\ 0 & 4 & -6 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \]


12.2. Let $A$ be an $n \times n$ matrix over $K$. Show that the mapping $f$ defined by $f(X, Y) = X^TAY$ is a
bilinear form on $K^n$.


For any $a, b \in K$ and any $X_i, Y_i \in K^n$,

\begin{align*}
f(aX_1 + bX_2, Y) &= (aX_1 + bX_2)^TAY = (aX_1^T + bX_2^T)AY \\
&= aX_1^TAY + bX_2^TAY = af(X_1, Y) + bf(X_2, Y)
\end{align*}

Hence, $f$ is linear in the first variable. Also,

\[ f(X, aY_1 + bY_2) = X^TA(aY_1 + bY_2) = aX^TAY_1 + bX^TAY_2 = af(X, Y_1) + bf(X, Y_2) \]

Hence, $f$ is linear in the second variable, and so $f$ is a bilinear form on $K^n$.
12.3. Let $f$ be the bilinear form on $\mathbf{R}^2$ defined by

\[ f[(x_1, x_2), (y_1, y_2)] = 2x_1y_1 - 3x_1y_2 + 4x_2y_2 \]

(a) Find the matrix $A$ of $f$ in the basis $\{u_1 = (1, 0),\ u_2 = (1, 1)\}$.
(b) Find the matrix $B$ of $f$ in the basis $\{v_1 = (2, 1),\ v_2 = (1, -1)\}$.
(c) Find the change-of-basis matrix $P$ from the basis $\{u_i\}$ to the basis $\{v_i\}$, and verify that
$B = P^TAP$.

(a) Set $A = [a_{ij}]$, where $a_{ij} = f(u_i, u_j)$. This yields

$a_{11} = f[(1, 0), (1, 0)] = 2 - 0 - 0 = 2$, $\quad a_{21} = f[(1, 1), (1, 0)] = 2 - 0 + 0 = 2$
$a_{12} = f[(1, 0), (1, 1)] = 2 - 3 - 0 = -1$, $\quad a_{22} = f[(1, 1), (1, 1)] = 2 - 3 + 4 = 3$

Thus, $A = \begin{bmatrix} 2 & -1 \\ 2 & 3 \end{bmatrix}$ is the matrix of $f$ in the basis $\{u_1, u_2\}$.

(b) Set $B = [b_{ij}]$, where $b_{ij} = f(v_i, v_j)$. This yields

$b_{11} = f[(2, 1), (2, 1)] = 8 - 6 + 4 = 6$, $\quad b_{21} = f[(1, -1), (2, 1)] = 4 - 3 - 4 = -3$
$b_{12} = f[(2, 1), (1, -1)] = 4 + 6 - 4 = 6$, $\quad b_{22} = f[(1, -1), (1, -1)] = 2 + 3 + 4 = 9$

Thus, $B = \begin{bmatrix} 6 & 6 \\ -3 & 9 \end{bmatrix}$ is the matrix of $f$ in the basis $\{v_1, v_2\}$.

(c) Writing $v_1$ and $v_2$ in terms of the $u_i$ yields $v_1 = u_1 + u_2$ and $v_2 = 2u_1 - u_2$. Then

\[ P = \begin{bmatrix} 1 & 2 \\ 1 & -1 \end{bmatrix}, \qquad P^T = \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix} \]

and

\[ P^TAP = \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 6 & 6 \\ -3 & 9 \end{bmatrix} = B \]


12.4. Prove Theorem 12.1: Let $V$ be an $n$-dimensional vector space over $K$. Let $\{\phi_1, \ldots, \phi_n\}$ be any
basis of the dual space $V^*$. Then $\{f_{ij} : i, j = 1, \ldots, n\}$ is a basis of $B(V)$, where $f_{ij}$ is defined by
$f_{ij}(u, v) = \phi_i(u)\phi_j(v)$. Thus, $\dim B(V) = n^2$.

Let $\{u_1, \ldots, u_n\}$ be the basis of $V$ dual to $\{\phi_i\}$. We first show that $\{f_{ij}\}$ spans $B(V)$. Let $f \in B(V)$ and
suppose $f(u_i, u_j) = a_{ij}$. We claim that $f = \sum_{i,j} a_{ij}f_{ij}$. It suffices to show that

\[ f(u_s, u_t) = \Big(\sum_{i,j} a_{ij}f_{ij}\Big)(u_s, u_t) \quad\text{for } s, t = 1, \ldots, n \]

We have

\[ \Big(\sum_{i,j} a_{ij}f_{ij}\Big)(u_s, u_t) = \sum_{i,j} a_{ij}f_{ij}(u_s, u_t) = \sum_{i,j} a_{ij}\phi_i(u_s)\phi_j(u_t) = \sum_{i,j} a_{ij}\delta_{is}\delta_{jt} = a_{st} = f(u_s, u_t) \]

as required. Hence, $\{f_{ij}\}$ spans $B(V)$. Next, suppose $\sum_{i,j} a_{ij}f_{ij} = 0$. Then for $s, t = 1, \ldots, n$,

\[ 0 = 0(u_s, u_t) = \Big(\sum_{i,j} a_{ij}f_{ij}\Big)(u_s, u_t) = a_{st} \]

The last step follows as above. Thus, $\{f_{ij}\}$ is independent, and hence is a basis of $B(V)$.


12.5. Prove Theorem 12.2. Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$. Let $A$ be
the matrix representing a bilinear form $f$ in the basis $S$. Then $B = P^TAP$ is the matrix representing
$f$ in the basis $S'$.

Let $u, v \in V$. Because $P$ is the change-of-basis matrix from $S$ to $S'$, we have $P[u]_{S'} = [u]_S$ and also
$P[v]_{S'} = [v]_S$; hence, $[u]_S^T = [u]_{S'}^TP^T$. Thus,

\[ f(u, v) = [u]_S^TA[v]_S = [u]_{S'}^TP^TAP[v]_{S'} \]

Because $u$ and $v$ are arbitrary elements of $V$, $P^TAP$ is the matrix of $f$ in the basis $S'$.
Symmetric Bilinear Forms, Quadratic Forms

12.6. Find the symmetric matrix that corresponds to each of the following quadratic forms:
(a) $q(x, y, z) = 3x^2 + 4xy - y^2 + 8xz - 6yz + z^2$,
(b) $q'(x, y, z) = 3x^2 + xz - 2yz$, (c) $q''(x, y, z) = 2x^2 - 5y^2 - 7z^2$


The symmetric matrix $A = [a_{ij}]$ that represents $q(x_1, \ldots, x_n)$ has the diagonal entry $a_{ii}$ equal to the
coefficient of the square term $x_i^2$ and the nondiagonal entries $a_{ij}$ and $a_{ji}$ each equal to half of the coefficient
of the cross-product term $x_ix_j$. Thus,

\[ \text{(a) } A = \begin{bmatrix} 3 & 2 & 4 \\ 2 & -1 & -3 \\ 4 & -3 & 1 \end{bmatrix}, \quad \text{(b) } A' = \begin{bmatrix} 3 & 0 & \tfrac{1}{2} \\ 0 & 0 & -1 \\ \tfrac{1}{2} & -1 & 0 \end{bmatrix}, \quad \text{(c) } A'' = \begin{bmatrix} 2 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & -7 \end{bmatrix} \]

The third matrix $A''$ is diagonal, because the quadratic form $q''$ is diagonal; that is, $q''$ has no cross-product
terms.


12.7. Find the quadratic form $q(X)$ that corresponds to each of the following symmetric matrices:

\[ \text{(a) } A = \begin{bmatrix} 5 & -3 \\ -3 & 8 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 4 & -5 & 7 \\ -5 & -6 & 8 \\ 7 & 8 & -9 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 2 & 4 & -1 & 5 \\ 4 & -7 & -6 & 8 \\ -1 & -6 & 3 & 9 \\ 5 & 8 & 9 & 1 \end{bmatrix} \]

The quadratic form $q(X)$ that corresponds to a symmetric matrix $M$ is defined by $q(X) = X^TMX$,
where $X = [x_i]$ is the column vector of unknowns.


(a) Compute as follows:

\begin{align*}
q(x, y) = X^TAX &= [x, y] \begin{bmatrix} 5 & -3 \\ -3 & 8 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = [5x - 3y,\ -3x + 8y] \begin{bmatrix} x \\ y \end{bmatrix} \\
&= 5x^2 - 3xy - 3xy + 8y^2 = 5x^2 - 6xy + 8y^2
\end{align*}

As expected, the coefficient 5 of the square term $x^2$ and the coefficient 8 of the square term $y^2$ are
the diagonal elements of $A$, and the coefficient $-6$ of the cross-product term $xy$ is the sum of
the nondiagonal elements $-3$ and $-3$ of $A$ (or twice the nondiagonal element $-3$, because $A$ is
symmetric).


(b) Because $B$ is a three-square matrix, there are three unknowns, say $x, y, z$ or $x_1, x_2, x_3$. Then

\[ q(x, y, z) = 4x^2 - 10xy - 6y^2 + 14xz + 16yz - 9z^2 \]

or

\[ q(x_1, x_2, x_3) = 4x_1^2 - 10x_1x_2 - 6x_2^2 + 14x_1x_3 + 16x_2x_3 - 9x_3^2 \]

Here we use the fact that the coefficients of the square terms $x_1^2, x_2^2, x_3^2$ (or $x^2, y^2, z^2$) are the respective
diagonal elements $4, -6, -9$ of $B$, and the coefficient of the cross-product term $x_ix_j$ is the sum of the
nondiagonal elements $b_{ij}$ and $b_{ji}$ (or twice $b_{ij}$, because $b_{ij} = b_{ji}$).

(c) Because $C$ is a four-square matrix, there are four unknowns. Hence,

\begin{align*}
q(x_1, x_2, x_3, x_4) = {} & 2x_1^2 - 7x_2^2 + 3x_3^2 + x_4^2 + 8x_1x_2 - 2x_1x_3 \\
& + 10x_1x_4 - 12x_2x_3 + 16x_2x_4 + 18x_3x_4
\end{align*}


12.8. Let $A = \begin{bmatrix} 1 & -3 & 2 \\ -3 & 7 & -5 \\ 2 & -5 & 8 \end{bmatrix}$. Apply Algorithm 12.1 to find a nonsingular matrix $P$ such that
$D = P^TAP$ is diagonal, and find $\operatorname{sig}(A)$, the signature of $A$.
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12.9. 


12.10. 


12.11. 


First form the block matrix M = [A, I]: 


i 23 21100 
M=(|4,)=|-3 7 -5,0 1 0 
2 -5 B16 0 1 


Using a,; = 1l as a pivot, apply the row operations ‘‘Replace R, by 3R; + R,” and ‘‘Replace R, by 
—2R, + R” to M and then apply the corresponding column operations ‘‘Replace C, by 3C, + C,” and 
“Replace C} by —2C, + C3” to A to obtain 


1-3 2! 100 1 00! 100 
0-2 1; 3 1 0 and then 0-2 1, 3 1 0 
0 141-201 0 141-201 


Next apply the row operation ‘‘Replace R} by R, + 2R,” and then the corresponding column operation 
“Replace C} by C, + 2C,” to obtain 


1 0 0 1 0 0 
and then 0 -2 0 3 1 0 
0 18 -1 1 2 


1 0 0 1 0 
0 -2 1 3 1 
0 0 9 -1 1 0 


NOOO 


Now A has been diagonalized and the transpose of P is in the right half of M. Thus, set 


1 3 =l 1 0 0 
P= |0 1 1 andthen D=PY4P=/0 —2 0 
00 2 0 0 18 


Note D has p=2 positive and n = 1 negative diagonal elements. Thus, the signature of A is 
sig(4)=p-n=2-1=l1. 


Justify Algorithm 12.1, which diagonalizes (under congruence) a symmetric matrix A. 


Consider the block matrix M = [A, I]. The algorithm applies a sequence of elementary row operations 
and the corresponding column operations to the left side of M, which is the matrix A. This is equivalent to 
premultiplying A by a sequence of elementary matrices, say, E}, E2,- . . , E, and postmultiplying A by the 
transposes of the Z;. Thus, when the algorithm ends, the diagonal matrix D on the left side of M is equal to 


D=E,---E,E,AE| E} ---E! = QAQ", where Q=E,---E,E, 


On the other hand, the algorithm only applies the elementary row operations to the identity matrix Z on the 
right side of M. Thus, when the algorithm ends, the matrix on the right side of M is equal to 


E,- EE l = E,---E,E, = Q 
Setting P = Q7, we get D = PAP, which is a diagonalization of A under congruence. 
Prove Theorem 12.4: Let f be a symmetric bilinear form on V over K (where 1 + 1 # 0). Then 


V has a basis in which f is represented by a diagonal matrix. 


Algorithm 12.1 shows that every symmetric matrix over K is congruent to a diagonal matrix. This is 
equivalent to the statement that f has a diagonal representation. 


Let $q$ be the quadratic form associated with the symmetric bilinear form $f$. Verify the polar
identity $f(u, v) = \tfrac{1}{2}[q(u + v) - q(u) - q(v)]$. (Assume that $1 + 1 \neq 0$.)

We have

\[
\begin{aligned}
q(u + v) - q(u) - q(v) &= f(u + v, u + v) - f(u, u) - f(v, v) \\
&= f(u, u) + f(u, v) + f(v, u) + f(v, v) - f(u, u) - f(v, v) = 2f(u, v)
\end{aligned}
\]

If $1 + 1 \neq 0$, we can divide by 2 to obtain the required identity.
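
The identity can be checked on a concrete form. The sketch below assumes $f(u, v) = u^T A v$ on $\mathbf{R}^2$ for an arbitrarily chosen symmetric matrix; the matrix, the vectors, and the function names are illustrative and do not come from the text.

```python
from fractions import Fraction

# Illustrative symmetric matrix (an assumption, not from the text).
A = [[2, -1], [-1, 3]]

def f(u, v):
    """Symmetric bilinear form f(u, v) = u^T A v."""
    return sum(u[i] * A[i][j] * v[j] for i in range(2) for j in range(2))

def q(u):
    """Associated quadratic form q(u) = f(u, u)."""
    return f(u, u)

def polar(u, v):
    """Right-hand side of the polar identity."""
    s = [a + b for a, b in zip(u, v)]
    return Fraction(q(s) - q(u) - q(v), 2)

u, v = (1, 4), (-3, 2)
print(f(u, v), polar(u, v))
```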
12.12. Consider the quadratic form $q(x, y) = 3x^2 + 2xy - y^2$ and the linear substitution
$x = s - 3t$, $y = 2s + t$

(a) Rewrite $q(x, y)$ in matrix notation, and find the matrix $A$ representing $q(x, y)$.
(b) Rewrite the linear substitution using matrix notation, and find the matrix $P$ corresponding to
the substitution.
(c) Find $q(s, t)$ using direct substitution.
(d) Find $q(s, t)$ using matrix notation.


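(a) Here $q(x, y) = \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$. Thus, $A = \begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix}$, and $q(X) = X^T A X$, where $X = [x, y]^T$.

(b) Here $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & -3 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} s \\ t \end{bmatrix}$. Thus, $P = \begin{bmatrix} 1 & -3 \\ 2 & 1 \end{bmatrix}$; and with $X = \begin{bmatrix} x \\ y \end{bmatrix}$ and $Y = \begin{bmatrix} s \\ t \end{bmatrix}$ we have $X = PY$.

(c) Substitute for $x$ and $y$ in $q$ to obtain

\[
\begin{aligned}
q(s, t) &= 3(s - 3t)^2 + 2(s - 3t)(2s + t) - (2s + t)^2 \\
&= 3(s^2 - 6st + 9t^2) + 2(2s^2 - 5st - 3t^2) - (4s^2 + 4st + t^2) = 3s^2 - 32st + 20t^2
\end{aligned}
\]

(d) Here $q(X) = X^T A X$ and $X = PY$. Thus, $X^T = Y^T P^T$. Therefore,

\[
\begin{aligned}
q(s, t) = q(Y) = Y^T P^T A P Y &= \begin{bmatrix} s & t \end{bmatrix} \begin{bmatrix} 1 & 2 \\ -3 & 1 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} 1 & -3 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} s \\ t \end{bmatrix} \\
&= \begin{bmatrix} s & t \end{bmatrix} \begin{bmatrix} 3 & -16 \\ -16 & 20 \end{bmatrix} \begin{bmatrix} s \\ t \end{bmatrix} = 3s^2 - 32st + 20t^2
\end{aligned}
\]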


[As expected, the results in parts (c) and (d) are equal.] 
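
The agreement between direct substitution and the matrix formula $P^T A P$ can also be checked mechanically. This is a small sketch with hypothetical helper names; the sample points are arbitrary.

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

A = [[3, 1], [1, -1]]    # matrix of q(x, y) = 3x^2 + 2xy - y^2
P = [[1, -3], [2, 1]]    # substitution x = s - 3t, y = 2s + t

B = matmul(matmul(transpose(P), A), P)   # matrix of q in the variables s, t

def q_xy(x, y):
    return 3 * x * x + 2 * x * y - y * y

def q_st(s, t):
    return B[0][0] * s * s + 2 * B[0][1] * s * t + B[1][1] * t * t

# Direct substitution and the matrix formula agree at sample points.
for s, t in [(1, 0), (0, 1), (2, -5), (-3, 7)]:
    assert q_xy(s - 3 * t, 2 * s + t) == q_st(s, t)
print(B)
```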


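12.13. Consider any diagonal matrix $A = \operatorname{diag}(a_1, \ldots, a_n)$ over $K$. Show that for any nonzero scalars
$k_1, \ldots, k_n \in K$, $A$ is congruent to a diagonal matrix $D$ with diagonal entries $a_1k_1^2, \ldots, a_nk_n^2$.
Furthermore, show that

(a) If $K = \mathbf{C}$, then we can choose $D$ so that its diagonal entries are only 1's and 0's.
(b) If $K = \mathbf{R}$, then we can choose $D$ so that its diagonal entries are only 1's, $-1$'s, and 0's.

Let $P = \operatorname{diag}(k_1, \ldots, k_n)$. Then, as required,

\[
D = P^T A P = \operatorname{diag}(k_i)\,\operatorname{diag}(a_i)\,\operatorname{diag}(k_i) = \operatorname{diag}(a_1k_1^2, \ldots, a_nk_n^2)
\]

(a) Let $P = \operatorname{diag}(b_i)$, where $b_i = \begin{cases} 1/\sqrt{a_i} & \text{if } a_i \neq 0 \\ 1 & \text{if } a_i = 0 \end{cases}$

Then $P^T A P$ has the required form.

(b) Let $P = \operatorname{diag}(b_i)$, where $b_i = \begin{cases} 1/\sqrt{|a_i|} & \text{if } a_i \neq 0 \\ 1 & \text{if } a_i = 0 \end{cases}$

Then $P^T A P$ has the required form.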


Remark: We emphasize that (b) is no longer true if ‘‘congruence’’ is replaced by 
‘‘Hermitian congruence.”’ 


12.14. Prove Theorem 12.5: Let $f$ be a symmetric bilinear form on $V$ over $\mathbf{R}$. Then there exists a basis
of $V$ in which $f$ is represented by a diagonal matrix. Every other diagonal matrix representation
of $f$ has the same number $p$ of positive entries and the same number $n$ of negative entries.

By Theorem 12.4, there is a basis $\{u_1, \ldots, u_n\}$ of $V$ in which $f$ is represented by a diagonal matrix
with, say, $p$ positive and $n$ negative entries. Now suppose $\{w_1, \ldots, w_n\}$ is another basis of $V$, in which $f$ is
represented by a diagonal matrix with $p'$ positive and $n'$ negative entries. We can assume without loss of
generality that the positive entries in each matrix appear first. Because $\operatorname{rank}(f) = p + n = p' + n'$, it
suffices to prove that $p = p'$.

Let $U$ be the linear span of $u_1, \ldots, u_p$ and let $W$ be the linear span of $w_{p'+1}, \ldots, w_n$. Then $f(v, v) > 0$
for every nonzero $v \in U$, and $f(v, v) \leq 0$ for every nonzero $v \in W$. Hence, $U \cap W = \{0\}$. Note that
$\dim U = p$ and $\dim W = n - p'$. Thus,

\[
\dim(U + W) = \dim U + \dim W - \dim(U \cap W) = p + (n - p') - 0 = p - p' + n
\]

But $\dim(U + W) \leq \dim V = n$; hence, $p - p' + n \leq n$ or $p \leq p'$. Similarly, $p' \leq p$ and therefore $p = p'$,
as required.


Remark: The above theorem and proof depend only on the concept of positivity. Thus, the 
theorem is true for any subfield K of the real field R such as the rational field Q. 


Positive Definite Real Quadratic Forms 


12.15. Prove that the following definitions of a positive definite quadratic form $q$ are equivalent:

(a) The diagonal entries are all positive in any diagonal representation of $q$.
(b) $q(Y) > 0$, for any nonzero vector $Y$ in $\mathbf{R}^n$.

Suppose $q(Y) = a_1y_1^2 + a_2y_2^2 + \cdots + a_ny_n^2$. If all the coefficients are positive, then clearly $q(Y) > 0$
whenever $Y \neq 0$. Thus, (a) implies (b). Conversely, suppose (a) is not true; that is, suppose some diagonal
entry $a_k \leq 0$. Let $e_k = (0, \ldots, 1, \ldots, 0)$ be the vector whose entries are all 0 except 1 in the $k$th position.
Then $q(e_k) = a_k$ is not positive, and so (b) is not true. That is, (b) implies (a). Accordingly, (a) and (b) are
equivalent.


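12.16. Determine whether each of the following quadratic forms $q$ is positive definite:
(a) $q(x, y, z) = x^2 + 2y^2 - 4xz - 4yz + 7z^2$
(b) $q(x, y, z) = x^2 + y^2 + 2xz + 4yz + 3z^2$

Diagonalize (under congruence) the symmetric matrix $A$ corresponding to $q$.

(a) Apply the operations “Replace $R_3$ by $2R_1 + R_3$” and “Replace $C_3$ by $2C_1 + C_3$,” and then “Replace
$R_3$ by $R_2 + R_3$” and “Replace $C_3$ by $C_2 + C_3$.” These yield

\[
A = \begin{bmatrix} 1 & 0 & -2 \\ 0 & 2 & -2 \\ -2 & -2 & 7 \end{bmatrix}
\simeq \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -2 \\ 0 & -2 & 3 \end{bmatrix}
\simeq \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

The diagonal representation of $q$ only contains positive entries, 1, 2, 1, on the diagonal. Thus, $q$ is
positive definite.

(b) We have

\[
A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 3 \end{bmatrix}
\simeq \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 2 & 2 \end{bmatrix}
\simeq \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{bmatrix}
\]

There is a negative entry $-2$ on the diagonal representation of $q$. Thus, $q$ is not positive definite.
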
12.17. Show that $q(x, y) = ax^2 + bxy + cy^2$ is positive definite if and only if $a > 0$ and the discriminant
$D = b^2 - 4ac < 0$.

Suppose $v = (x, y) \neq 0$. Then either $x \neq 0$ or $y \neq 0$; say, $y \neq 0$. Let $t = x/y$. Then

\[
q(v) = y^2[a(x/y)^2 + b(x/y) + c] = y^2(at^2 + bt + c)
\]

However, the following are equivalent:

(i) $s = at^2 + bt + c$ is positive for every value of $t$.
(ii) $s = at^2 + bt + c$ lies above the $t$-axis.
(iii) $a > 0$ and $D = b^2 - 4ac < 0$.

Thus, $q$ is positive definite if and only if $a > 0$ and $D < 0$. [Remark: $D < 0$ is the same as $\det(A) > 0$,
where $A$ is the symmetric matrix corresponding to $q$.]

12.18. Determine whether or not each of the following quadratic forms q is positive definite: 


(a) $q(x, y) = x^2 - 4xy + 7y^2$, (b) $q(x, y) = x^2 + 8xy + 5y^2$, (c) $q(x, y) = 3x^2 + 2xy + y^2$

Compute the discriminant $D = b^2 - 4ac$, and then use Problem 12.17.

(a) $D = 16 - 28 = -12$. Because $a = 1 > 0$ and $D < 0$, $q$ is positive definite.

(b) $D = 64 - 20 = 44$. Because $D > 0$, $q$ is not positive definite.

(c) $D = 4 - 12 = -8$. Because $a = 3 > 0$ and $D < 0$, $q$ is positive definite.
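
The criterion of Problem 12.17 is easy to apply mechanically. The sketch below uses a hypothetical function name and reads the coefficient triples $(a, b, c)$ off the three forms above.

```python
def is_positive_definite(a, b, c):
    """Test q(x, y) = a*x^2 + b*x*y + c*y^2 using a > 0 and D = b^2 - 4ac < 0."""
    return a > 0 and b * b - 4 * a * c < 0

forms = {"a": (1, -4, 7), "b": (1, 8, 5), "c": (3, 2, 1)}
for label, (a, b, c) in forms.items():
    # Print the label, the discriminant, and the verdict.
    print(label, b * b - 4 * a * c, is_positive_definite(a, b, c))
```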


Hermitian Forms 


12.19. Determine whether the following matrices are Hermitian: 


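(a) $\begin{bmatrix} 2 & 2+3i & 4-5i \\ 2-3i & 5 & 6+2i \\ 4+5i & 6-2i & -7 \end{bmatrix}$, (b) $\begin{bmatrix} 3 & 2-i & 4+i \\ 2-i & 6 & i \\ 4+i & i & 7 \end{bmatrix}$, (c) $\begin{bmatrix} 4 & -3 & 5 \\ -3 & 2 & 1 \\ 5 & 1 & -6 \end{bmatrix}$

A complex matrix $A = [a_{ij}]$ is Hermitian if $A^* = A$, that is, if $a_{ij} = \bar{a}_{ji}$.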
(a) Yes, because it is equal to its conjugate transpose. 
(b) No, even though it is symmetric. 


(c) Yes. In fact, a real matrix is Hermitian if and only if it is symmetric. 
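
The entrywise condition $a_{ij} = \bar{a}_{ji}$ translates directly into a check. A minimal sketch, with a hypothetical function name and Python's built-in complex numbers, applied to the three matrices of this problem:

```python
def is_hermitian(A):
    """Return True if the square matrix A equals its conjugate transpose."""
    n = len(A)
    return all(len(row) == n for row in A) and all(
        A[i][j] == complex(A[j][i]).conjugate() for i in range(n) for j in range(n)
    )

A = [[2, 2 + 3j, 4 - 5j], [2 - 3j, 5, 6 + 2j], [4 + 5j, 6 - 2j, -7]]
B = [[3, 2 - 1j, 4 + 1j], [2 - 1j, 6, 1j], [4 + 1j, 1j, 7]]
C = [[4, -3, 5], [-3, 2, 1], [5, 1, -6]]
print(is_hermitian(A), is_hermitian(B), is_hermitian(C))
```

Matrix (b) fails because it is symmetric with non-real off-diagonal entries, so $a_{ij} = a_{ji} \neq \bar{a}_{ji}$.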


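12.20. Let $A$ be a Hermitian matrix. Show that $f$ is a Hermitian form on $\mathbf{C}^n$ where $f$ is defined by
$f(X, Y) = X^T A \bar{Y}$.

For all $a, b \in \mathbf{C}$ and all $X_1, X_2, Y \in \mathbf{C}^n$,

\[
\begin{aligned}
f(aX_1 + bX_2, Y) &= (aX_1 + bX_2)^T A \bar{Y} = (aX_1^T + bX_2^T) A \bar{Y} \\
&= aX_1^T A \bar{Y} + bX_2^T A \bar{Y} = af(X_1, Y) + bf(X_2, Y)
\end{aligned}
\]

Hence, $f$ is linear in the first variable. Also,

\[
\overline{f(X, Y)} = \overline{X^T A \bar{Y}} = \overline{(X^T A \bar{Y})^T} = \overline{\bar{Y}^T A^T X} = Y^T A^* \bar{X} = Y^T A \bar{X} = f(Y, X)
\]

Hence, $f$ is a Hermitian form on $\mathbf{C}^n$.

Remark: We use the fact that $X^T A \bar{Y}$ is a scalar and so it is equal to its transpose.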


12.21. Let $f$ be a Hermitian form on $V$. Let $H$ be the matrix of $f$ in a basis $S = \{u_i\}$ of $V$. Prove the
following:

(a) $f(u, v) = [u]_S^T H \overline{[v]_S}$ for all $u, v \in V$.
(b) If $P$ is the change-of-basis matrix from $S$ to a new basis $S'$ of $V$, then $B = P^T H \bar{P}$ (or
$B = Q^* H Q$, where $Q = \bar{P}$) is the matrix of $f$ in the new basis $S'$.

Note that (b) is the complex analog of Theorem 12.2.

(a) Let $u, v \in V$ and suppose $u = a_1u_1 + \cdots + a_nu_n$ and $v = b_1u_1 + \cdots + b_nu_n$. Then, as required,

\[
\begin{aligned}
f(u, v) &= f(a_1u_1 + \cdots + a_nu_n,\; b_1u_1 + \cdots + b_nu_n) \\
&= \sum_{i,j} a_i \bar{b}_j f(u_i, u_j) = [a_1, \ldots, a_n]\, H\, [\bar{b}_1, \ldots, \bar{b}_n]^T = [u]_S^T H \overline{[v]_S}
\end{aligned}
\]


(b) Because $P$ is the change-of-basis matrix from $S$ to $S'$, we have $P[u]_{S'} = [u]_S$ and $P[v]_{S'} = [v]_S$; hence,
$[u]_S^T = [u]_{S'}^T P^T$ and $\overline{[v]_S} = \bar{P}\,\overline{[v]_{S'}}$. Thus, by (a),

\[
f(u, v) = [u]_S^T H \overline{[v]_S} = [u]_{S'}^T P^T H \bar{P}\, \overline{[v]_{S'}}
\]

But $u$ and $v$ are arbitrary elements of $V$; hence, $P^T H \bar{P}$ is the matrix of $f$ in the basis $S'$.


12.22. Let $H = \begin{bmatrix} 1 & 1+i & 2i \\ 1-i & 4 & 2-3i \\ -2i & 2+3i & 7 \end{bmatrix}$, a Hermitian matrix.
Find a nonsingular matrix $P$ such that $D = P^T H \bar{P}$ is diagonal. Also, find the signature of $H$.


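Use the modified Algorithm 12.1 that applies the same row operations but the corresponding conjugate
column operations. Thus, first form the block matrix $M = [H, I]$:

\[
M = \left[\begin{array}{ccc|ccc}
1 & 1+i & 2i & 1 & 0 & 0 \\
1-i & 4 & 2-3i & 0 & 1 & 0 \\
-2i & 2+3i & 7 & 0 & 0 & 1
\end{array}\right]
\]

Apply the row operations “Replace $R_2$ by $(-1 + i)R_1 + R_2$” and “Replace $R_3$ by $2iR_1 + R_3$” and then the
corresponding conjugate column operations “Replace $C_2$ by $(-1 - i)C_1 + C_2$” and “Replace $C_3$ by
$-2iC_1 + C_3$” to obtain

\[
\left[\begin{array}{ccc|ccc}
1 & 1+i & 2i & 1 & 0 & 0 \\
0 & 2 & -5i & -1+i & 1 & 0 \\
0 & 5i & 3 & 2i & 0 & 1
\end{array}\right]
\quad\text{and then}\quad
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & 1 & 0 & 0 \\
0 & 2 & -5i & -1+i & 1 & 0 \\
0 & 5i & 3 & 2i & 0 & 1
\end{array}\right]
\]

Next apply the row operation “Replace $R_3$ by $-5iR_2 + 2R_3$” and the corresponding conjugate column
operation “Replace $C_3$ by $5iC_2 + 2C_3$” to obtain

\[
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & 1 & 0 & 0 \\
0 & 2 & -5i & -1+i & 1 & 0 \\
0 & 0 & -19 & 5+9i & -5i & 2
\end{array}\right]
\quad\text{and then}\quad
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & 1 & 0 & 0 \\
0 & 2 & 0 & -1+i & 1 & 0 \\
0 & 0 & -38 & 5+9i & -5i & 2
\end{array}\right]
\]

Now $H$ has been diagonalized, and the transpose of the right half of $M$ is $P$. Thus, set

\[
P = \begin{bmatrix} 1 & -1+i & 5+9i \\ 0 & 1 & -5i \\ 0 & 0 & 2 \end{bmatrix},
\quad\text{and then}\quad
D = P^T H \bar{P} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -38 \end{bmatrix}
\]

Note $D$ has $p = 2$ positive elements and $n = 1$ negative elements. Thus, the signature of $H$ is
$\operatorname{sig}(H) = 2 - 1 = 1$.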
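
The result can be double-checked by multiplying out $P^T H \bar{P}$. The sketch below uses Python's built-in complex numbers and a hypothetical `matmul` helper; all entries are small Gaussian integers, so the floating-point arithmetic is exact here.

```python
def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

H = [[1, 1 + 1j, 2j], [1 - 1j, 4, 2 - 3j], [-2j, 2 + 3j, 7]]
P = [[1, -1 + 1j, 5 + 9j], [0, 1, -5j], [0, 0, 2]]

P_T = [list(col) for col in zip(*P)]                           # transpose of P
P_bar = [[complex(x).conjugate() for x in row] for row in P]   # entrywise conjugate

D = matmul(matmul(P_T, H), P_bar)
diag = [D[i][i].real for i in range(3)]
signature = sum(d > 0 for d in diag) - sum(d < 0 for d in diag)
print(D, signature)
```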


Miscellaneous Problems 


12.23. Prove Theorem 12.3: Let $f$ be an alternating form on $V$. Then there exists a basis of $V$ in which $f$
is represented by a block diagonal matrix $M$ with blocks of the form $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ or 0. The number
of nonzero blocks is uniquely determined by $f$ [because it is equal to $\tfrac{1}{2}\operatorname{rank}(f)$].

If $f = 0$, then the theorem is obviously true. Also, if $\dim V = 1$, then $f(k_1u, k_2u) = k_1k_2 f(u, u) = 0$
and so $f = 0$. Accordingly, we can assume that $\dim V > 1$ and $f \neq 0$.

Because $f \neq 0$, there exist (nonzero) $u_1, u_2 \in V$ such that $f(u_1, u_2) \neq 0$. In fact, multiplying $u_1$ by
an appropriate factor, we can assume that $f(u_1, u_2) = 1$ and so $f(u_2, u_1) = -1$. Now $u_1$ and $u_2$ are
linearly independent; because if, say, $u_2 = ku_1$, then $f(u_1, u_2) = f(u_1, ku_1) = kf(u_1, u_1) = 0$. Let
$U = \operatorname{span}(u_1, u_2)$; then,
(i) The matrix representation of the restriction of $f$ to $U$ in the basis $\{u_1, u_2\}$ is $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$;
(ii) If $u \in U$, say $u = au_1 + bu_2$, then
$f(u, u_1) = f(au_1 + bu_2, u_1) = -b$ and $f(u, u_2) = f(au_1 + bu_2, u_2) = a$

Let $W$ consist of those vectors $w \in V$ such that $f(w, u_1) = 0$ and $f(w, u_2) = 0$. Equivalently,
$W = \{w \in V : f(w, u) = 0 \text{ for every } u \in U\}$

We claim that $V = U \oplus W$. It is clear that $U \cap W = \{0\}$, and so it remains to show that $V = U + W$. Let
$v \in V$. Set

\[
u = f(v, u_2)u_1 - f(v, u_1)u_2 \quad\text{and}\quad w = v - u \tag{1}
\]

Because $u$ is a linear combination of $u_1$ and $u_2$, $u \in U$.
We show next that $w \in W$. By (1) and (ii), $f(u, u_1) = f(v, u_1)$; hence,

\[
f(w, u_1) = f(v - u, u_1) = f(v, u_1) - f(u, u_1) = 0
\]

Similarly, $f(u, u_2) = f(v, u_2)$ and so

\[
f(w, u_2) = f(v - u, u_2) = f(v, u_2) - f(u, u_2) = 0
\]

Then $w \in W$ and so, by (1), $v = u + w$, where $u \in U$ and $w \in W$. This shows that $V = U + W$; therefore, $V = U \oplus W$.

Now the restriction of $f$ to $W$ is an alternating bilinear form on $W$. By induction, there exists a basis
$u_3, \ldots, u_n$ of $W$ in which the matrix representing $f$ restricted to $W$ has the desired form. Accordingly,
$u_1, u_2, u_3, \ldots, u_n$ is a basis of $V$ in which the matrix representing $f$ has the desired form.


SUPPLEMENTARY PROBLEMS 


Bilinear Forms 


12.24. Let $u = (x_1, x_2)$ and $v = (y_1, y_2)$. Determine which of the following are bilinear forms on $\mathbf{R}^2$:

(a) $f(u, v) = 2x_1y_2 - 3x_2y_1$, (c) $f(u, v) = 3x_2y_2$, (e) $f(u, v) = 1$,
(b) $f(u, v) = x_1 + y_2$, (d) $f(u, v) = x_1x_2 + y_1y_2$, (f) $f(u, v) = 0$


12.25. Let $f$ be the bilinear form on $\mathbf{R}^2$ defined by
$f[(x_1, x_2), (y_1, y_2)] = 3x_1y_1 - 2x_1y_2 + 4x_2y_1 - x_2y_2$

(a) Find the matrix $A$ of $f$ in the basis $\{u_1 = (1, 1),\ u_2 = (1, 2)\}$.
(b) Find the matrix $B$ of $f$ in the basis $\{v_1 = (1, -1),\ v_2 = (3, 1)\}$.
(c) Find the change-of-basis matrix $P$ from $\{u_i\}$ to $\{v_i\}$, and verify that $B = P^T A P$.


12.26. Let $V$ be the vector space of two-square matrices over $\mathbf{R}$. Let $M = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}$, and let $f(A, B) = \operatorname{tr}(A^T M B)$,
where $A, B \in V$ and “tr” denotes trace. (a) Show that $f$ is a bilinear form on $V$. (b) Find the matrix of $f$ in
the basis

\[
\left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}
\]


12.27. Let $B(V)$ be the set of bilinear forms on $V$ over $K$. Prove the following:

(a) If $f, g \in B(V)$, then $f + g,\ kg \in B(V)$ for any $k \in K$.
(b) If $\phi$ and $\sigma$ are linear functions on $V$, then $f(u, v) = \phi(u)\sigma(v)$ belongs to $B(V)$.


12.28. Let $[f]$ denote the matrix representation of a bilinear form $f$ on $V$ relative to a basis $\{u_i\}$. Show that the
mapping $f \mapsto [f]$ is an isomorphism of $B(V)$ onto the vector space of $n$-square matrices.
12.29. Let $f$ be a bilinear form on $V$. For any subset $S$ of $V$, let
$S^\perp = \{v \in V : f(u, v) = 0 \text{ for every } u \in S\}$ and $S^\top = \{v \in V : f(v, u) = 0 \text{ for every } u \in S\}$

Show that: (a) $S^\perp$ and $S^\top$ are subspaces of $V$; (b) $S_1 \subseteq S_2$ implies $S_2^\perp \subseteq S_1^\perp$ and $S_2^\top \subseteq S_1^\top$;
(c) $\{0\}^\perp = \{0\}^\top = V$.


12.30. Suppose $f$ is a bilinear form on $V$. Prove that: $\operatorname{rank}(f) = \dim V - \dim V^\perp = \dim V - \dim V^\top$, and hence,
$\dim V^\perp = \dim V^\top$.


12.31. Let $f$ be a bilinear form on $V$. For each $u \in V$, let $\hat{u}: V \to K$ and $\tilde{u}: V \to K$ be defined by $\hat{u}(x) = f(x, u)$ and
$\tilde{u}(x) = f(u, x)$. Prove the following:

(a) $\hat{u}$ and $\tilde{u}$ are each linear; i.e., $\hat{u}, \tilde{u} \in V^*$,
(b) $u \mapsto \hat{u}$ and $u \mapsto \tilde{u}$ are each linear mappings from $V$ into $V^*$,
(c) $\operatorname{rank}(f) = \operatorname{rank}(u \mapsto \hat{u}) = \operatorname{rank}(u \mapsto \tilde{u})$.


12.32. Show that congruence of matrices (denoted by ~) is an equivalence relation; that is, 
(i) $A \simeq A$; (ii) If $A \simeq B$, then $B \simeq A$; (iii) If $A \simeq B$ and $B \simeq C$, then $A \simeq C$.


Symmetric Bilinear Forms, Quadratic Forms 


12.33. Find the symmetric matrix A belonging to each of the following quadratic forms: 


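(a) $q(x, y, z) = 2x^2 - 8xy + y^2 - 16xz + 14yz + 5z^2$, (c) $q(x, y, z) = xy + y^2 + 4xz + z^2$
(b) $q(x, y, z) = x^2 - xz + y^2$, (d) $q(x, y, z) = xy + yz$


12.34. For each of the following symmetric matrices $A$, find a nonsingular matrix $P$ such that $D = P^T A P$ is
diagonal:

(a) $A = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 3 & 6 \\ 2 & 6 & 7 \end{bmatrix}$, (b) $A = \begin{bmatrix} 1 & -2 & 1 \\ -2 & 5 & 3 \\ 1 & 3 & -2 \end{bmatrix}$, (c) $A = \begin{bmatrix} 1 & -1 & 0 & 2 \\ -1 & 2 & 1 & 0 \\ 0 & 1 & 1 & 2 \\ 2 & 0 & 2 & -1 \end{bmatrix}$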


12.35. Let $q(x, y) = 2x^2 - 6xy - 3y^2$ and $x = s + 2t$, $y = 3s - t$.

(a) Rewrite $q(x, y)$ in matrix notation, and find the matrix $A$ representing the quadratic form.
(b) Rewrite the linear substitution using matrix notation, and find the matrix $P$ corresponding to the
substitution.
(c) Find $q(s, t)$ using (i) direct substitution, (ii) matrix notation.


12.36. For each of the following quadratic forms $q(x, y, z)$, find a nonsingular linear substitution expressing the
variables $x, y, z$ in terms of variables $r, s, t$ such that $q(r, s, t)$ is diagonal:

(a) $q(x, y, z) = x^2 + 6xy + 8y^2 - 4xz + 2yz - 9z^2$,
(b) $q(x, y, z) = 2x^2 - 3y^2 + 8xz + 12yz + 25z^2$,
(c) $q(x, y, z) = x^2 + 2xy + 3y^2 + 4xz + 8yz + 6z^2$.

In each case, find the rank and signature.


12.37. Give an example of a quadratic form $q(x, y)$ such that $q(u) = 0$ and $q(v) = 0$ but $q(u + v) \neq 0$.


12.38. Let $S(V)$ denote all symmetric bilinear forms on $V$. Show that
(a) $S(V)$ is a subspace of $B(V)$; (b) If $\dim V = n$, then $\dim S(V) = \tfrac{1}{2}n(n + 1)$.


12.39. Consider a real quadratic polynomial $q(x_1, \ldots, x_n) = \sum_{i,j=1}^{n} a_{ij}x_ix_j$, where $a_{ij} = a_{ji}$.

(a) If $a_{11} \neq 0$, show that the substitution

\[
x_1 = y_1 - \frac{1}{a_{11}}(a_{12}y_2 + \cdots + a_{1n}y_n), \quad x_2 = y_2, \quad \ldots, \quad x_n = y_n
\]

yields the equation $q(x_1, \ldots, x_n) = a_{11}y_1^2 + q'(y_2, \ldots, y_n)$, where $q'$ is also a quadratic polynomial.

(b) If $a_{11} = 0$ but, say, $a_{12} \neq 0$, show that the substitution

\[
x_1 = y_1 + y_2, \quad x_2 = y_1 - y_2, \quad x_3 = y_3, \quad \ldots, \quad x_n = y_n
\]

yields the equation $q(x_1, \ldots, x_n) = \sum b_{ij}y_iy_j$, where $b_{11} \neq 0$, which reduces this case to case (a).


Remark: This method of diagonalizing g is known as completing the square. 


Positive Definite Quadratic Forms 


12.40. Determine whether or not each of the following quadratic forms is positive definite:

(a) $q(x, y) = 4x^2 + 5xy + 7y^2$, (c) $q(x, y, z) = x^2 + 4xy + 5y^2 + 6xz + 2yz + 4z^2$
(b) $q(x, y) = 2x^2 - 3xy - y^2$, (d) $q(x, y, z) = x^2 + 2xy + 2y^2 + 4xz + 6yz + 7z^2$


12.41. Find those values of $k$ such that the given quadratic form is positive definite:

(a) $q(x, y) = 2x^2 - 5xy + ky^2$, (b) $q(x, y) = 3x^2 - kxy + 12y^2$
(c) $q(x, y, z) = x^2 + 2xy + 2y^2 + 2xz + 6yz + kz^2$


12.42. Suppose $A$ is a real symmetric positive definite matrix. Show that $A = P^T P$ for some nonsingular matrix $P$.


Hermitian Forms 


12.43. Modify Algorithm 12.1 so that, for a given Hermitian matrix $H$, it finds a nonsingular matrix $P$ for which
$D = P^T H \bar{P}$ is diagonal.


12.44. For each Hermitian matrix $H$, find a nonsingular matrix $P$ such that $D = P^T H \bar{P}$ is diagonal:

(a) $H = \begin{bmatrix} 1 & i \\ -i & 2 \end{bmatrix}$, (b) $H = \begin{bmatrix} 1 & 2+3i \\ 2-3i & -1 \end{bmatrix}$, (c) $H = \begin{bmatrix} 1 & i & 2+i \\ -i & 2 & 1-i \\ 2-i & 1+i & 2 \end{bmatrix}$

Find the rank and signature in each case.


12.45. Let $A$ be a complex nonsingular matrix. Show that $H = A^*A$ is Hermitian and positive definite.


12.46. We say that $B$ is Hermitian congruent to $A$ if there exists a nonsingular matrix $P$ such that $B = P^T A \bar{P}$ or,
equivalently, if there exists a nonsingular matrix $Q$ such that $B = Q^* A Q$. Show that Hermitian congruence
is an equivalence relation. (Note: If $P = \bar{Q}$, then $P^T A \bar{P} = Q^* A Q$.)


12.47. Prove Theorem 12.7: Let $f$ be a Hermitian form on $V$. Then there is a basis $S$ of $V$ in which $f$ is represented
by a diagonal matrix, and every such diagonal representation has the same number $p$ of positive entries and
the same number $n$ of negative entries.


Miscellaneous Problems 


12.48. Let $e$ denote an elementary row operation, and let $f^*$ denote the corresponding conjugate column operation
(where each scalar $k$ in $e$ is replaced by $\bar{k}$ in $f^*$). Show that the elementary matrix corresponding to $f^*$ is
the conjugate transpose of the elementary matrix corresponding to $e$.


12.49. Let $V$ and $W$ be vector spaces over $K$. A mapping $f: V \times W \to K$ is called a bilinear form on $V$ and $W$ if
(i) $f(av_1 + bv_2, w) = af(v_1, w) + bf(v_2, w)$,
(ii) $f(v, aw_1 + bw_2) = af(v, w_1) + bf(v, w_2)$

for every $a, b \in K$, $v_i \in V$, $w_j \in W$. Prove the following:

(a) The set $B(V, W)$ of bilinear forms on $V$ and $W$ is a subspace of the vector space of functions from
$V \times W$ into $K$.

(b) If $\{\phi_1, \ldots, \phi_m\}$ is a basis of $V^*$ and $\{\sigma_1, \ldots, \sigma_n\}$ is a basis of $W^*$, then
$\{f_{ij} : i = 1, \ldots, m,\ j = 1, \ldots, n\}$ is a basis of $B(V, W)$, where $f_{ij}$ is defined by $f_{ij}(v, w) = \phi_i(v)\sigma_j(w)$.
Thus, $\dim B(V, W) = \dim V \dim W$.

[Note that if $V = W$, then we obtain the space $B(V)$ investigated in this chapter.]


12.50. Let $V$ be a vector space over $K$. A mapping $f: \underbrace{V \times V \times \cdots \times V}_{m \text{ times}} \to K$ is called a multilinear (or $m$-linear)
form on $V$ if $f$ is linear in each variable; that is, for $i = 1, \ldots, m$,

\[
f(\ldots, \widehat{au + bv}, \ldots) = af(\ldots, \hat{u}, \ldots) + bf(\ldots, \hat{v}, \ldots)
\]

where $\widehat{\ \cdots\ }$ denotes the $i$th element, and other elements are held fixed. An $m$-linear form $f$ is said to be
alternating if $f(v_1, \ldots, v_m) = 0$ whenever $v_i = v_j$ for $i \neq j$. Prove the following:

(a) The set $B_m(V)$ of $m$-linear forms on $V$ is a subspace of the vector space of functions from
$V \times V \times \cdots \times V$ into $K$.

(b) The set $A_m(V)$ of alternating $m$-linear forms on $V$ is a subspace of $B_m(V)$.

Remark 1: If $m = 2$, then we obtain the space $B(V)$ investigated in this chapter.

Remark 2: If $V = K^m$, then the determinant function is an alternating $m$-linear form on $V$.


ANSWERS TO SUPPLEMENTARY PROBLEMS 


Notation: $M = [R_1;\ R_2;\ \ldots]$ denotes a matrix $M$ with rows $R_1, R_2, \ldots$.

12.24. (a) yes, (b) no, (c) yes, (d) no, (e) no, (f) yes

12.25. (a) $A = [4, 1;\ 7, 3]$, (b) $B = [0, -4;\ 20, 32]$, (c) $P = [3, 5;\ -2, -2]$

12.26. (b) $[1, 0, 2, 0;\ 0, 1, 0, 2;\ 3, 0, 5, 0;\ 0, 3, 0, 5]$

12.33. (a) $[2, -4, -8;\ -4, 1, 7;\ -8, 7, 5]$, (b) $[1, 0, -\tfrac{1}{2};\ 0, 1, 0;\ -\tfrac{1}{2}, 0, 0]$,
(c) $[0, \tfrac{1}{2}, 2;\ \tfrac{1}{2}, 1, 0;\ 2, 0, 1]$, (d) $[0, \tfrac{1}{2}, 0;\ \tfrac{1}{2}, 0, \tfrac{1}{2};\ 0, \tfrac{1}{2}, 0]$

12.34. (a) $P = [1, 0, -2;\ 0, 1, -2;\ 0, 0, 1]$, $D = \operatorname{diag}(1, 3, -9)$;
(b) $P = [1, 2, -11;\ 0, 1, -5;\ 0, 0, 1]$, $D = \operatorname{diag}(1, 1, -28)$;
(c) $P = [1, 1, -1, -4;\ 0, 1, -1, -2;\ 0, 0, 1, 0;\ 0, 0, 0, 1]$, $D = \operatorname{diag}(1, 1, 0, -9)$

12.35. $A = [2, -3;\ -3, -3]$, $P = [1, 2;\ 3, -1]$, $q(s, t) = -43s^2 - 4st + 17t^2$

12.36. (a) $x = r - 3s - 19t$, $y = s + 7t$, $z = t$; $q(r, s, t) = r^2 - s^2 + 36t^2$;
(b) $x = r - 2t$, $y = s + 2t$, $z = t$; $q(r, s, t) = 2r^2 - 3s^2 + 29t^2$;
(c) $x = r - s - t$, $y = s - t$, $z = t$; $q(r, s, t) = r^2 + 2s^2$

12.37. $q(x, y) = x^2 - y^2$, $u = (1, 1)$, $v = (1, -1)$

12.40. (a) yes, (b) no, (c) no, (d) yes

12.41. (a) $k > \tfrac{25}{8}$, (b) $-12 < k < 12$, (c) $k > 5$

12.44. (a) $P = [1, i;\ 0, 1]$, $D = I$, $s = 2$; (b) $P = [1, -2 + 3i;\ 0, 1]$, $D = \operatorname{diag}(1, -14)$, $s = 0$;
(c) $P = [1, i, -3 + i;\ 0, 1, i;\ 0, 0, 1]$, $D = \operatorname{diag}(1, 1, -4)$, $s = 1$


Linear Operators on Inner 
Product Spaces 


13.1 Introduction 


This chapter investigates the space $A(V)$ of linear operators $T$ on an inner product space $V$. (See
Chapter 7.) Thus, the base field $K$ is either the real numbers $\mathbf{R}$ or the complex numbers $\mathbf{C}$. In fact, different
terminologies will be used for the real case and the complex case. We also use the fact that the inner
products on real Euclidean space $\mathbf{R}^n$ and complex Euclidean space $\mathbf{C}^n$ may be defined, respectively, by

\[
\langle u, v \rangle = u^T v \quad\text{and}\quad \langle u, v \rangle = u^T \bar{v}
\]

where $u$ and $v$ are column vectors.

The reader should review the material in Chapter 7 and be very familiar with the notions of norm
(length), orthogonality, and orthonormal bases. We also note that Chapter 7 mainly dealt with real inner
product spaces, whereas here we assume that $V$ is a complex inner product space unless otherwise stated
or implied.

Lastly, we note that in Chapter 2, we used $A^H$ to denote the conjugate transpose of a complex matrix $A$;
that is, $A^H = \bar{A}^T$. This notation is not standard. Many texts, especially advanced texts, use $A^*$ to denote
such a matrix; we will use that notation in this chapter. That is, now $A^* = \bar{A}^T$.


13.2 Adjoint Operators 


We begin with the following basic definition. 


DEFINITION: A linear operator $T$ on an inner product space $V$ is said to have an adjoint operator $T^*$
on $V$ if $\langle T(u), v \rangle = \langle u, T^*(v) \rangle$ for every $u, v \in V$.


The following example shows that the adjoint operator has a simple description within the context of 
matrix mappings. 


EXAMPLE 13.1
(a) Let $A$ be a real $n$-square matrix viewed as a linear operator on $\mathbf{R}^n$. Then, for every $u, v \in \mathbf{R}^n$,
\[
\langle Au, v \rangle = (Au)^T v = u^T A^T v = \langle u, A^T v \rangle
\]
Thus, the transpose $A^T$ of $A$ is the adjoint of $A$.
(b) Let $B$ be a complex $n$-square matrix viewed as a linear operator on $\mathbf{C}^n$. Then for every $u, v \in \mathbf{C}^n$,
\[
\langle Bu, v \rangle = (Bu)^T \bar{v} = u^T B^T \bar{v} = u^T \overline{B^* v} = \langle u, B^* v \rangle
\]
Thus, the conjugate transpose $B^*$ of $B$ is the adjoint of $B$.

Remark: $B^*$ may mean either the adjoint of $B$ as a linear operator or the conjugate transpose of $B$
as a matrix. By Example 13.1(b), the ambiguity makes no difference, because they denote the same
object.
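
Example 13.1(b) can be checked on concrete data. In the sketch below the matrix and vectors are arbitrary illustrative choices (not from the text), the helper names are hypothetical, and the inner product is the standard one on $\mathbf{C}^n$, $\langle u, v \rangle = u^T \bar{v}$.

```python
def inner(u, v):
    """Standard inner product on C^n: <u, v> = u^T conj(v)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def apply(M, u):
    """Matrix-vector product M u."""
    return [sum(m * x for m, x in zip(row, u)) for row in M]

def conj_transpose(M):
    return [[complex(x).conjugate() for x in col] for col in zip(*M)]

# Illustrative data (an assumption, not from the text).
B = [[1 + 2j, 3j], [4, 5 - 1j]]
u = [1 - 1j, 2]
v = [3, 1 + 4j]

lhs = inner(apply(B, u), v)                   # <Bu, v>
rhs = inner(u, apply(conj_transpose(B), v))   # <u, B*v>
print(lhs, rhs)
```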


The following theorem (proved in Problem 13.4) is the main result in this section. 


THEOREM 13.1: Let $T$ be a linear operator on a finite-dimensional inner product space $V$ over $K$.
Then

(i) There exists a unique linear operator $T^*$ on $V$ such that $\langle T(u), v \rangle = \langle u, T^*(v) \rangle$
for every $u, v \in V$. (That is, $T$ has an adjoint $T^*$.)

(ii) If $A$ is the matrix representation of $T$ with respect to any orthonormal basis
$S = \{u_i\}$ of $V$, then the matrix representation of $T^*$ in the basis $S$ is the
conjugate transpose $A^*$ of $A$ (or the transpose $A^T$ of $A$ when $K$ is real).


We emphasize that no such simple relationship exists between the matrices representing $T$ and $T^*$ if
the basis is not orthonormal. Thus, we see one useful property of orthonormal bases. We also emphasize
that this theorem is not valid if $V$ has infinite dimension (Problem 13.31).

The following theorem (proved in Problem 13.5) summarizes some of the properties of the adjoint. 


THEOREM 13.2: Let $T, T_1, T_2$ be linear operators on $V$ and let $k \in K$. Then
(i) $(T_1 + T_2)^* = T_1^* + T_2^*$, (iii) $(T_1T_2)^* = T_2^*T_1^*$,
(ii) $(kT)^* = \bar{k}T^*$, (iv) $(T^*)^* = T$.


Observe the similarity between the above theorem and Theorem 2.3 on properties of the transpose 
operation on matrices. 


Linear Functionals and Inner Product Spaces 


Recall (Chapter 11) that a linear functional $\phi$ on a vector space $V$ is a linear mapping $\phi: V \to K$. This
subsection contains an important result (Theorem 13.3) that is used in the proof of the above basic
Theorem 13.1.

Let $V$ be an inner product space. Each $u \in V$ determines a mapping $\hat{u}: V \to K$ defined by

\[
\hat{u}(v) = \langle v, u \rangle
\]

Now, for any $a, b \in K$ and any $v_1, v_2 \in V$,

\[
\hat{u}(av_1 + bv_2) = \langle av_1 + bv_2, u \rangle = a\langle v_1, u \rangle + b\langle v_2, u \rangle = a\hat{u}(v_1) + b\hat{u}(v_2)
\]

That is, $\hat{u}$ is a linear functional on $V$. The converse is also true for spaces of finite dimension and it is
contained in the following important theorem (proved in Problem 13.3).


THEOREM 13.3: Let $\phi$ be a linear functional on a finite-dimensional inner product space $V$. Then
there exists a unique vector $u \in V$ such that $\phi(v) = \langle v, u \rangle$ for every $v \in V$.


We remark that the above theorem is not valid for spaces of infinite dimension (Problem 13.24). 


13.3 Analogy Between A(V) and C, Special Linear Operators 


Let $A(V)$ denote the algebra of all linear operators on a finite-dimensional inner product space $V$. The
adjoint mapping $T \mapsto T^*$ on $A(V)$ is quite analogous to the conjugation mapping $z \mapsto \bar{z}$ on the complex
field $\mathbf{C}$. To illustrate this analogy we identify in Table 13-1 certain classes of operators $T \in A(V)$ whose
behavior under the adjoint map imitates the behavior under conjugation of familiar classes of complex
numbers.

The analogy between these operators T and complex numbers z is reflected in the next theorem. 
Table 13-1

Class of complex numbers | Behavior under conjugation | Class of operators in A(V) | Behavior under the adjoint map
Unit circle ($\lvert z \rvert = 1$) | $\bar{z} = 1/z$ | Orthogonal operators (real case); unitary operators (complex case) | $T^* = T^{-1}$
Real axis | $\bar{z} = z$ | Self-adjoint operators, also called symmetric (real case) or Hermitian (complex case) | $T^* = T$
Imaginary axis | $\bar{z} = -z$ | Skew-adjoint operators, also called skew-symmetric (real case) or skew-Hermitian (complex case) | $T^* = -T$
Positive real axis $(0, \infty)$ | $z = \bar{w}w$, $w \neq 0$ | Positive definite operators | $T = S^*S$ with $S$ nonsingular


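THEOREM 13.4: Let $\lambda$ be an eigenvalue of a linear operator $T$ on $V$.
(i) If $T^* = T^{-1}$ (i.e., $T$ is orthogonal or unitary), then $|\lambda| = 1$.
(ii) If $T^* = T$ (i.e., $T$ is self-adjoint), then $\lambda$ is real.
(iii) If $T^* = -T$ (i.e., $T$ is skew-adjoint), then $\lambda$ is pure imaginary.
(iv) If $T = S^*S$ with $S$ nonsingular (i.e., $T$ is positive definite), then $\lambda$ is real and
positive.


Proof. In each case let $v$ be a nonzero eigenvector of $T$ belonging to $\lambda$; that is, $T(v) = \lambda v$ with
$v \neq 0$. Hence, $\langle v, v \rangle$ is positive.

Proof of (i). We show that $\lambda\bar{\lambda}\langle v, v \rangle = \langle v, v \rangle$:
\[
\lambda\bar{\lambda}\langle v, v \rangle = \langle \lambda v, \lambda v \rangle = \langle T(v), T(v) \rangle = \langle v, T^*T(v) \rangle = \langle v, I(v) \rangle = \langle v, v \rangle
\]
But $\langle v, v \rangle \neq 0$; hence, $\lambda\bar{\lambda} = 1$ and so $|\lambda| = 1$.

Proof of (ii). We show that $\lambda\langle v, v \rangle = \bar{\lambda}\langle v, v \rangle$:
\[
\lambda\langle v, v \rangle = \langle \lambda v, v \rangle = \langle T(v), v \rangle = \langle v, T^*(v) \rangle = \langle v, T(v) \rangle = \langle v, \lambda v \rangle = \bar{\lambda}\langle v, v \rangle
\]
But $\langle v, v \rangle \neq 0$; hence, $\lambda = \bar{\lambda}$ and so $\lambda$ is real.

Proof of (iii). We show that $\lambda\langle v, v \rangle = -\bar{\lambda}\langle v, v \rangle$:
\[
\lambda\langle v, v \rangle = \langle \lambda v, v \rangle = \langle T(v), v \rangle = \langle v, T^*(v) \rangle = \langle v, -T(v) \rangle = \langle v, -\lambda v \rangle = -\bar{\lambda}\langle v, v \rangle
\]
But $\langle v, v \rangle \neq 0$; hence, $\lambda = -\bar{\lambda}$ or $\bar{\lambda} = -\lambda$, and so $\lambda$ is pure imaginary.

Proof of (iv). Note first that $S(v) \neq 0$ because $S$ is nonsingular; hence, $\langle S(v), S(v) \rangle$ is positive. We
show that $\lambda\langle v, v \rangle = \langle S(v), S(v) \rangle$:
\[
\lambda\langle v, v \rangle = \langle \lambda v, v \rangle = \langle T(v), v \rangle = \langle S^*S(v), v \rangle = \langle S(v), S(v) \rangle
\]
But $\langle v, v \rangle$ and $\langle S(v), S(v) \rangle$ are positive; hence, $\lambda$ is positive.

Remark: Each of the above operators $T$ commutes with its adjoint; that is, $TT^* = T^*T$. Such
operators are called normal operators.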
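
The four cases of Theorem 13.4 can be illustrated with 2-square matrices, whose eigenvalues follow from the characteristic polynomial $t^2 - (\operatorname{tr})t + \det$. The matrices below are illustrative choices, not taken from the text, and `eig2x2` is a hypothetical helper.

```python
import cmath
import math

def eig2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial."""
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

theta = 0.7  # arbitrary angle, chosen for illustration
rotation = eig2x2(math.cos(theta), -math.sin(theta), math.sin(theta), math.cos(theta))
symmetric = eig2x2(2, 1, 1, 2)     # T* = T
skew = eig2x2(0, 1, -1, 0)         # T* = -T
gram = eig2x2(5, 4, 4, 5)          # S^T S for the nonsingular S = [[1, 2], [2, 1]]

print([abs(lam) for lam in rotation])   # moduli close to 1
print(symmetric, skew, gram)
```

The rotation is orthogonal, so its eigenvalues lie on the unit circle; the symmetric matrix has real eigenvalues, the skew-symmetric one has pure imaginary eigenvalues, and the matrix $S^TS$ has positive eigenvalues.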


13.4 Self-Adjoint Operators 


Let $T$ be a self-adjoint operator on an inner product space $V$; that is, suppose

\[
T^* = T
\]

(If $T$ is defined by a matrix $A$, then $A$ is symmetric or Hermitian according as $A$ is real or complex.) By
Theorem 13.4, the eigenvalues of $T$ are real. The following is another important property of $T$.


THEOREM 13.5: Let $T$ be a self-adjoint operator on $V$. Suppose $u$ and $v$ are eigenvectors of $T$
belonging to distinct eigenvalues. Then $u$ and $v$ are orthogonal; that is, $\langle u, v \rangle = 0$.


Proof. Suppose $T(u) = \lambda_1 u$ and $T(v) = \lambda_2 v$, where $\lambda_1 \neq \lambda_2$. We show that $\lambda_1\langle u, v \rangle = \lambda_2\langle u, v \rangle$:

\[
\begin{aligned}
\lambda_1\langle u, v \rangle &= \langle \lambda_1 u, v \rangle = \langle T(u), v \rangle = \langle u, T^*(v) \rangle = \langle u, T(v) \rangle \\
&= \langle u, \lambda_2 v \rangle = \bar{\lambda}_2\langle u, v \rangle = \lambda_2\langle u, v \rangle
\end{aligned}
\]

(The fourth equality uses the fact that $T^* = T$, and the last equality uses the fact that the eigenvalue $\lambda_2$ is
real.) Because $\lambda_1 \neq \lambda_2$, we get $\langle u, v \rangle = 0$. Thus, the theorem is proved.
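
A small numerical instance of Theorem 13.5, using an illustrative real symmetric matrix (not from the text) whose eigenvectors are known in closed form; the helper names are hypothetical.

```python
def apply(M, u):
    """Matrix-vector product M u."""
    return [sum(m * x for m, x in zip(row, u)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[2, 1], [1, 2]]          # illustrative real symmetric matrix
u, lam1 = [1, 1], 3           # A u = 3 u
v, lam2 = [1, -1], 1          # A v = 1 v

# Confirm that u and v really are eigenvectors for the stated eigenvalues.
assert apply(A, u) == [lam1 * x for x in u]
assert apply(A, v) == [lam2 * x for x in v]
print(dot(u, v))
```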


13.5 Orthogonal and Unitary Operators 


Let $U$ be a linear operator on a finite-dimensional inner product space $V$. Suppose

\[
U^* = U^{-1} \quad\text{or equivalently}\quad UU^* = U^*U = I
\]

Recall that $U$ is said to be orthogonal or unitary according as the underlying field is real or complex. The
next theorem (proved in Problem 13.10) gives alternative characterizations of these operators.


THEOREM 13.6: The following conditions on an operator $U$ are equivalent:
(i) $U^* = U^{-1}$; that is, $UU^* = U^*U = I$. [$U$ is unitary (orthogonal).]
(ii) $U$ preserves inner products; that is, for every $v, w \in V$,
$\langle U(v), U(w) \rangle = \langle v, w \rangle$.
(iii) $U$ preserves lengths; that is, for every $v \in V$, $\|U(v)\| = \|v\|$.

EXAMPLE 13.2

(a) Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be the linear operator that rotates each vector $v$ about the $z$-axis by a fixed angle $\theta$ as shown in
Fig. 10-1 (Section 10.3). That is, $T$ is defined by

\[
T(x, y, z) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta,\ z)
\]

We note that lengths (distances from the origin) are preserved under $T$. Thus, $T$ is an orthogonal operator.

(b) Let $V$ be $\ell_2$-space (Hilbert space), defined in Section 7.3. Let $T: V \to V$ be the linear operator defined by

\[
T(a_1, a_2, a_3, \ldots) = (0, a_1, a_2, a_3, \ldots)
\]

Clearly, $T$ preserves inner products and lengths. However, $T$ is not surjective, because, for example, $(1, 0, 0, \ldots)$
does not belong to the image of $T$; hence, $T$ is not invertible. Thus, we see that Theorem 13.6 is not valid for
spaces of infinite dimension.
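
Part (a) can be checked numerically. The sketch below implements the rotation formula with floating-point arithmetic, so lengths and inner products are compared up to rounding; the sample vectors and angle are arbitrary, and the function names are hypothetical.

```python
import math

def rotate_z(v, theta):
    """T(x, y, z) = (x cos(theta) - y sin(theta), x sin(theta) + y cos(theta), z)."""
    x, y, z = v
    c, s = math.cos(theta), math.sin(theta)
    return (x * c - y * s, x * s + y * c, z)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

v, w, theta = (1.0, 2.0, 3.0), (-2.0, 0.5, 4.0), 1.1   # arbitrary sample data
print(norm(v), norm(rotate_z(v, theta)))
print(dot(v, w), dot(rotate_z(v, theta), rotate_z(w, theta)))
```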


An isomorphism from one inner product space into another is a bijective mapping that preserves the
three basic operations of an inner product space: vector addition, scalar multiplication, and inner
products. Thus, the above mappings (orthogonal and unitary) may also be characterized as the
isomorphisms of $V$ into itself. Note that such a mapping $U$ also preserves distances, because

\[
\|U(v) - U(w)\| = \|U(v - w)\| = \|v - w\|
\]

Hence, $U$ is called an isometry.

13.6 Orthogonal and Unitary Matrices 


Let U be a linear operator on an inner product space V. By Theorem 13.1, we obtain the following results. 


THEOREM 13.7A: A complex matrix $A$ represents a unitary operator $U$ (relative to an orthonormal
basis) if and only if $A^* = A^{-1}$.


THEOREM 13.7B: A real matrix $A$ represents an orthogonal operator $U$ (relative to an orthonormal
basis) if and only if $A^T = A^{-1}$.


The above theorems motivate the following definitions (which appeared in Sections 2.10 and 2.11).

DEFINITION: A complex matrix $A$ for which $A^* = A^{-1}$ is called a unitary matrix.

DEFINITION: A real matrix $A$ for which $A^T = A^{-1}$ is called an orthogonal matrix.

We repeat Theorem 2.6, which characterizes the above matrices. 


THEOREM 13.8: The following conditions on a matrix A are equivalent: 
(i) A is unitary (orthogonal). 
(ii) The rows of A form an orthonormal set. 
(iii) The columns of A form an orthonormal set. 
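
For a real matrix, conditions (ii) and (iii) can be tested directly. The sketch below uses exact rational entries so that the comparisons are exact; both matrices are illustrative choices rather than examples from the text, and `is_orthonormal` is a hypothetical helper.

```python
from fractions import Fraction

def is_orthonormal(vectors):
    """Check <v_i, v_j> = 1 if i == j and 0 otherwise (real case)."""
    return all(
        sum(a * b for a, b in zip(vi, vj)) == (1 if i == j else 0)
        for i, vi in enumerate(vectors)
        for j, vj in enumerate(vectors)
    )

# Illustrative orthogonal matrix with exact rational entries.
A = [[Fraction(3, 5), Fraction(4, 5)], [Fraction(-4, 5), Fraction(3, 5)]]
# A nonsingular matrix that is not orthogonal.
B = [[1, 1], [0, 1]]

for M in (A, B):
    rows = M
    cols = [list(c) for c in zip(*M)]
    print(is_orthonormal(rows), is_orthonormal(cols))
```

In both cases the row test and the column test agree, as the theorem predicts.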


13.7 Change of Orthonormal Basis 


Orthonormal bases play a special role in the theory of inner product spaces V. Thus, we are naturally 
interested in the properties of the change-of-basis matrix from one such basis to another. The following 
theorem (proved in Problem 13.12) holds. 


THEOREM 13.9: Let $\{u_1, \ldots, u_n\}$ be an orthonormal basis of an inner product space $V$. Then the
change-of-basis matrix from $\{u_i\}$ into another orthonormal basis is unitary
(orthogonal). Conversely, if $P = [a_{ij}]$ is a unitary (orthogonal) matrix, then the
following is an orthonormal basis:

\[
\{u_i' = a_{1i}u_1 + a_{2i}u_2 + \cdots + a_{ni}u_n : i = 1, \ldots, n\}
\]


Recall that matrices $A$ and $B$ representing the same linear operator $T$ are similar; that is, $B = P^{-1}AP$,
where $P$ is the (nonsingular) change-of-basis matrix. On the other hand, if $V$ is an inner product space, we
are usually interested in the case when $P$ is unitary (or orthogonal) as suggested by Theorem 13.9. (Recall
that $P$ is unitary if the conjugate transpose $P^* = P^{-1}$, and $P$ is orthogonal if the transpose $P^T = P^{-1}$.) This
leads to the following definition.


DEFINITION: Complex matrices $A$ and $B$ are unitarily equivalent if there exists a unitary matrix $P$
for which $B = P^*AP$. Analogously, real matrices $A$ and $B$ are orthogonally equivalent
if there exists an orthogonal matrix $P$ for which $B = P^TAP$.


Note that orthogonally equivalent matrices are necessarily congruent. 


13.8 Positive Definite and Positive Operators 


Let P be a linear operator on an inner product space V. Then 


(i) P is said to be positive definite if P = S*S for some nonsingular operator S. 
(ii) P is said to be positive (or nonnegative or semidefinite) if P = S*S for some operator S. 


The following theorems give alternative characterizations of these operators. 


THEOREM 13.10A: The following conditions on an operator P are equivalent: 


(i) P = T^2 for some nonsingular self-adjoint operator T. 
(ii) P is positive definite. 
(iii) P is self-adjoint and ⟨P(u), u⟩ > 0 for every u ≠ 0 in V. 


The corresponding theorem for positive operators (proved in Problem 13.21) follows. 


THEOREM 13.10B: The following conditions on an operator P are equivalent: 


(i) P = T^2 for some self-adjoint operator T. 
(ii) P is positive; that is, P = S*S. 
(iii) P is self-adjoint and ⟨P(u), u⟩ ≥ 0 for every u ∈ V. 
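
The passage from (ii) to (iii) can be seen numerically. The Python sketch below forms P = S*S for a sample singular matrix S and evaluates ⟨P(u), u⟩ on a few vectors; the matrix S, the sample vectors, and the helper names are assumptions made for illustration. Because S is singular, one of the values is exactly zero, which is why this P is positive but not positive definite.

```python
def conj_transpose(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[complex(A[j][i]).conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def matmul(A, B):
    """Matrix product A B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(A, u):
    """Matrix-vector product A u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def inner(u, v):
    """Standard inner product on C^n, conjugate-linear in the second argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

S = [[1, 2], [2, 4]]                      # singular sample matrix (assumption)
P = matmul(conj_transpose(S), S)          # P = S*S

samples = [(1, 0), (0, 1), (1, 1j), (2, -1), (3 - 1j, 2j)]
values = [inner(apply(P, u), u) for u in samples]

self_adjoint = P == conj_transpose(P)
nonnegative = all(abs(v.imag) < 1e-12 and v.real >= 0 for v in values)
print(self_adjoint, nonnegative, values)
```

Each value equals ⟨S(u), S(u)⟩, the squared norm of S(u), which is the identity used in Problem 13.21.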


13.9 Diagonalization and Canonical Forms in Inner Product Spaces 


Let T be a linear operator on a finite-dimensional inner product space V over K. Representing T by a 
diagonal matrix depends upon the eigenvectors and eigenvalues of T, and hence upon the roots of 
the characteristic polynomial Δ(t) of T. Now Δ(t) always factors into linear polynomials over the 
complex field C, but it may have no linear factors over the real field R. Thus, the situation 
for real inner product spaces (sometimes called Euclidean spaces) is inherently different from the 
situation for complex inner product spaces (sometimes called unitary spaces), and we treat them 
separately. 


Real Inner Product Spaces, Symmetric and Orthogonal Operators 
The following theorem (proved in Problem 13.14) holds. 


THEOREM 13.11: Let T be a symmetric (self-adjoint) operator on a real finite-dimensional inner product 
space V. Then there exists an orthonormal basis of V consisting of eigenvectors of 
T; that is, T can be represented by a diagonal matrix relative to an orthonormal 
basis. 


We give the corresponding statement for matrices. 


THEOREM 13.11: (Alternative Form) Let A be a real symmetric matrix. Then there exists an 
orthogonal matrix P such that B = P^{-1}AP = P^T AP is diagonal. 


We can choose the columns of the above matrix P to be normalized orthogonal eigenvectors of A; then 
the diagonal entries of B are the corresponding eigenvalues. 
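
For a 2 x 2 real symmetric matrix this construction can be carried out in closed form. The following Python sketch is one way to do it under stated assumptions: the function name `diagonalize_sym2` is hypothetical, the eigenvalues come from the quadratic formula applied to t^2 - tr(A) t + |A|, and the vector (b, λ - a) is used as an eigenvector for λ when b ≠ 0. The sample matrix is the one that represents the quadratic form of Problem 13.15.

```python
import math

def diagonalize_sym2(a, b, d):
    """Orthogonally diagonalize the real symmetric matrix [[a, b], [b, d]].

    Returns (P, (lam1, lam2)); column k of P is a unit eigenvector
    belonging to the k-th eigenvalue.
    """
    if b == 0:                                   # already diagonal
        return [[1.0, 0.0], [0.0, 1.0]], (a, d)
    tr, det = a + d, a * d - b * b
    disc = math.sqrt(tr * tr - 4 * det)          # equals sqrt((a-d)^2 + 4b^2)
    lams = ((tr - disc) / 2, (tr + disc) / 2)
    cols = []
    for lam in lams:
        x, y = b, lam - a                        # solves (A - lam I)(x, y) = 0
        norm = math.hypot(x, y)
        cols.append((x / norm, y / norm))
    P = [[cols[0][0], cols[1][0]],
         [cols[0][1], cols[1][1]]]
    return P, lams

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[3, -3], [-3, 11]]
P, eigenvalues = diagonalize_sym2(3, -3, 11)
D = matmul(matmul(transpose(P), A), P)           # P^T A P
print(eigenvalues)
print(D)
```

The eigenvectors returned here may differ in sign from hand-computed ones; any choice of unit eigenvectors gives the same diagonal matrix as long as the column order matches the eigenvalue order.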

On the other hand, an orthogonal operator T need not be symmetric, and so it may not be represented 
by a diagonal matrix relative to an orthonormal basis. However, such an operator T does have a simple 
canonical representation, as described in the following theorem (proved in Problem 13.16). 


THEOREM 13.12: Let T be an orthogonal operator on a real inner product space V. Then there exists 
an orthonormal basis of V in which T is represented by a block diagonal matrix M 
of the form 


\[ M = \operatorname{diag}\left(1, \ldots, 1,\; -1, \ldots, -1,\; \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix}, \ldots, \begin{bmatrix} \cos\theta_r & -\sin\theta_r \\ \sin\theta_r & \cos\theta_r \end{bmatrix}\right) \]


The reader may recognize that each of the 2 x 2 diagonal blocks represents a rotation in the 
corresponding two-dimensional subspace, and each diagonal entry —1 represents a reflection in the 
corresponding one-dimensional subspace. 
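
The shape of this canonical form can be checked directly. The Python sketch below assembles a block diagonal matrix with s entries equal to 1, t entries equal to -1, and one rotation block per angle, and then verifies that M^T M = I; the function name `canonical_orthogonal` and the sample sizes and angles are illustrative assumptions.

```python
import math

def canonical_orthogonal(s, t, thetas):
    """Block diagonal matrix diag(1,...,1, -1,...,-1, R(theta_1), ..., R(theta_r))."""
    size = s + t + 2 * len(thetas)
    M = [[0.0] * size for _ in range(size)]
    for i in range(s):
        M[i][i] = 1.0                     # identity part
    for i in range(s, s + t):
        M[i][i] = -1.0                    # reflections
    k = s + t
    for theta in thetas:                  # 2 x 2 rotation blocks
        c, sn = math.cos(theta), math.sin(theta)
        M[k][k], M[k][k + 1] = c, -sn
        M[k + 1][k], M[k + 1][k + 1] = sn, c
        k += 2
    return M

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

M = canonical_orthogonal(1, 1, [math.pi / 3, 2.0])
MtM = matmul(transpose(M), M)
is_orthogonal = all(abs(MtM[i][j] - (1 if i == j else 0)) < 1e-12
                    for i in range(len(M)) for j in range(len(M)))
print(len(M), is_orthogonal)
```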


Complex Inner Product Spaces, Normal and Triangular Operators 


A linear operator T is said to be normal if it commutes with its adjoint, that is, if TT* = T*T. We note 
that normal operators include both self-adjoint and unitary operators. 

Analogously, a complex matrix A is said to be normal if it commutes with its conjugate transpose— 
that is, if AA* = A*A. 


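EXAMPLE 13.3 Let 

\[ A = \begin{bmatrix} 1 & 1 \\ i & 3+2i \end{bmatrix}. \quad\text{Then}\quad A^* = \begin{bmatrix} 1 & -i \\ 1 & 3-2i \end{bmatrix}. \]

Also 

\[ AA^* = \begin{bmatrix} 2 & 3-3i \\ 3+3i & 14 \end{bmatrix} = A^*A. \]

Thus, A is normal. 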
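
A normality check of this kind is easy to mechanize. The Python sketch below compares AA* with A*A entry by entry; the matrix A = [[1, 1], [i, 3+2i]] is the one this sketch assumes for Example 13.3, the second matrix N is an extra non-normal sample, and the helper names are hypothetical. Exact equality is used only because all entries are small Gaussian integers; for general floating-point data a tolerance would be needed.

```python
def conj_transpose(A):
    """Conjugate transpose A* of a square matrix stored as a list of rows."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Product of two n x n matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_normal(A):
    """True if A commutes with its conjugate transpose."""
    A_star = conj_transpose(A)
    return matmul(A, A_star) == matmul(A_star, A)

A = [[1, 1], [1j, 3 + 2j]]      # assumed matrix of Example 13.3
N = [[1, 1j], [0, 1]]           # triangular but not diagonal, so not normal

AA_star = matmul(A, conj_transpose(A))
print(AA_star, is_normal(A), is_normal(N))
```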


The following theorem (proved in Problem 13.19) holds. 
THEOREM 13.13: Let T be a normal operator on a complex finite-dimensional inner product space V. 
Then there exists an orthonormal basis of V consisting of eigenvectors of T; that 
is, T can be represented by a diagonal matrix relative to an orthonormal basis. 


We give the corresponding statement for matrices. 


THEOREM 13.13: (Alternative Form) Let A be a normal matrix. Then there exists a unitary matrix 
P such that B = P^{-1}AP = P*AP is diagonal. 


The following theorem (proved in Problem 13.20) shows that even nonnormal operators on unitary 
spaces have a relatively simple form. 


THEOREM 13.14: Let T be an arbitrary operator on a complex finite-dimensional inner product space 
V. Then T can be represented by a triangular matrix relative to an orthonormal 
basis of V. 

THEOREM 13.14: (Alternative Form) Let A be an arbitrary complex matrix. Then there exists a 
unitary matrix P such that B = P^{-1}AP = P*AP is triangular. 


13.10 Spectral Theorem 


The Spectral Theorem is a reformulation of the diagonalization Theorems 13.11 and 13.13. 


THEOREM 13.15: (Spectral Theorem) Let T be a normal (symmetric) operator on a complex (real) 
finite-dimensional inner product space V. Then there exist linear operators 
E_1, ..., E_r on V and scalars λ_1, ..., λ_r such that 

(i) T = λ_1 E_1 + λ_2 E_2 + ... + λ_r E_r, 
(ii) E_1 + E_2 + ... + E_r = I, 
(iii) E_1^2 = E_1, E_2^2 = E_2, ..., E_r^2 = E_r, 
(iv) E_i E_j = 0 for i ≠ j. 


The above linear operators E_1, ..., E_r are projections in the sense that E_i^2 = E_i. Moreover, they are 
said to be orthogonal projections because they have the additional property that E_i E_j = 0 for i ≠ j. 

The following example shows the relationship between a diagonal matrix representation and the 
corresponding orthogonal projections. 


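EXAMPLE 13.4 Consider the following diagonal matrices A, E_1, E_2, E_3: 

\[ A = \operatorname{diag}(2, 3, 3, 5), \quad E_1 = \operatorname{diag}(1, 0, 0, 0), \quad E_2 = \operatorname{diag}(0, 1, 1, 0), \quad E_3 = \operatorname{diag}(0, 0, 0, 1) \]

The reader can verify that 


(i) A = 2E_1 + 3E_2 + 5E_3, (ii) E_1 + E_2 + E_3 = I, (iii) E_i^2 = E_i, (iv) E_i E_j = 0 for i ≠ j. 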
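
The four conditions can be verified mechanically for any diagonal matrix. In the Python sketch below, the function `spectral_projections` (a hypothetical name) builds one 0/1 diagonal projection per distinct diagonal entry, and the four conditions of the Spectral Theorem are then tested; the diagonal entries 2, 3, 3, 5 are sample data chosen to match the eigenvalues 2, 3, 5 that appear in condition (i) of the example.

```python
def spectral_projections(diagonal):
    """Map each distinct diagonal entry lam to the projection onto its eigenspace."""
    n = len(diagonal)
    return {lam: [[1 if i == j and diagonal[i] == lam else 0 for j in range(n)]
                  for i in range(n)]
            for lam in sorted(set(diagonal))}

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * a for a in row] for row in A]

diagonal = [2, 3, 3, 5]                    # sample diagonal entries (assumption)
n = len(diagonal)
A = [[diagonal[i] if i == j else 0 for j in range(n)] for i in range(n)]
I = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
Z = [[0] * n for _ in range(n)]

E = spectral_projections(diagonal)

weighted_sum, plain_sum = Z, Z
for lam, proj in E.items():
    weighted_sum = add(weighted_sum, scale(lam, proj))
    plain_sum = add(plain_sum, proj)

cond_i = weighted_sum == A                                     # A = sum of lam * E_lam
cond_ii = plain_sum == I                                       # projections sum to I
cond_iii = all(matmul(p, p) == p for p in E.values())          # each E is idempotent
cond_iv = all(matmul(E[a], E[b]) == Z for a in E for b in E if a != b)
print(cond_i, cond_ii, cond_iii, cond_iv)
```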


SOLVED PROBLEMS 
Adjoints 
13.1. Find the adjoint of F: R^3 → R^3 defined by 


F(x, y, z) = (3x + 4y - 5z, 2x - 6y + 7z, 5x - 9y + z) 


First find the matrix A that represents F in the usual basis of R^3, that is, the matrix A whose rows are 
the coefficients of x, y, z, and then form the transpose A^T of A. This yields 


\[ A = \begin{bmatrix} 3 & 4 & -5 \\ 2 & -6 & 7 \\ 5 & -9 & 1 \end{bmatrix} \quad\text{and then}\quad A^T = \begin{bmatrix} 3 & 2 & 5 \\ 4 & -6 & -9 \\ -5 & 7 & 1 \end{bmatrix} \]

The adjoint F* is represented by the transpose of A; hence, 


F* (x,y,z) = (3x + 2y + 5z, 4x — 6y — 9z, —5x + 7y + z) 
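
The defining property ⟨F(u), v⟩ = ⟨u, F*(v)⟩ gives a quick way to test an adjoint computed by hand. The Python sketch below does this for the matrix of Problem 13.1 with two sample vectors; the vectors u and v and the helper names are illustrative choices.

```python
A = [[3, 4, -5],
     [2, -6, 7],
     [5, -9, 1]]                    # matrix of F in the usual basis of R^3

def transpose(M):
    return [list(row) for row in zip(*M)]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

A_T = transpose(A)                  # represents the adjoint F*

u, v = (1, 2, 3), (-1, 0, 4)        # sample vectors
lhs = dot(apply(A, u), v)           # <F(u), v>
rhs = dot(u, apply(A_T, v))         # <u, F*(v)>
print(lhs, rhs)
```

Agreement on a few vectors is only a sanity check; the identity holds for all u and v because the usual basis is orthonormal.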


13.2. Find the adjoint of G: C^3 → C^3 defined by 

G(x, y, z) = [2x + (1 - i)y, (3 + 2i)x - 4iz, 2ix + (4 - 3i)y - 3z] 


First find the matrix B that represents G in the usual basis of C^3, and then form the conjugate transpose 
B* of B. This yields 


\[ B = \begin{bmatrix} 2 & 1-i & 0 \\ 3+2i & 0 & -4i \\ 2i & 4-3i & -3 \end{bmatrix} \quad\text{and then}\quad B^* = \begin{bmatrix} 2 & 3-2i & -2i \\ 1+i & 0 & 4+3i \\ 0 & 4i & -3 \end{bmatrix} \]


Then G* (x,y,z) = [2x + (3 — 2i)y — 2iz, (1+ i)x + (4+ 3i)z, 4iy — 3z]. 
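
In the complex case the transpose must be replaced by the conjugate transpose, and the inner product is conjugate-linear in its second argument. The Python sketch below recomputes B* for Problem 13.2 and tests ⟨G(u), v⟩ = ⟨u, G*(v)⟩ on one pair of sample vectors; the vectors and helper names are assumptions for illustration.

```python
B = [[2, 1 - 1j, 0],
     [3 + 2j, 0, -4j],
     [2j, 4 - 3j, -3]]              # matrix of G in the usual basis of C^3

def conj_transpose(A):
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inner(u, v):
    """Standard inner product on C^n: sum of u_k times the conjugate of v_k."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

B_star = conj_transpose(B)          # represents the adjoint G*

u, v = (1, 1j, 2), (1j, 1, -1)      # sample vectors
lhs = inner(apply(B, u), v)         # <G(u), v>
rhs = inner(u, apply(B_star, v))    # <u, G*(v)>
print(B_star)
print(lhs, rhs)
```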


13.3. Prove Theorem 13.3: Let φ be a linear functional on an n-dimensional inner product space V. 
Then there exists a unique vector u ∈ V such that φ(v) = ⟨v, u⟩ for every v ∈ V. 


Let {w_1, ..., w_n} be an orthonormal basis of V. Set 


\[ u = \overline{\phi(w_1)}\,w_1 + \overline{\phi(w_2)}\,w_2 + \cdots + \overline{\phi(w_n)}\,w_n \]


Let û be the linear functional on V defined by û(v) = ⟨v, u⟩ for every v ∈ V. Then, for i = 1, ..., n, 


\[ \hat{u}(w_i) = \langle w_i, u\rangle = \langle w_i,\ \overline{\phi(w_1)}\,w_1 + \cdots + \overline{\phi(w_n)}\,w_n\rangle = \phi(w_i) \]

Because û and φ agree on each basis vector, û = φ. 

Now suppose u' is another vector in V for which φ(v) = ⟨v, u'⟩ for every v ∈ V. Then ⟨v, u⟩ = ⟨v, u'⟩ 
or ⟨v, u - u'⟩ = 0. In particular, this is true for v = u - u', and so ⟨u - u', u - u'⟩ = 0. This yields 
u - u' = 0 and u = u'. Thus, such a vector u is unique, as claimed. 
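
The construction in this proof is explicit enough to compute with. The Python sketch below applies it in C^3 with the usual (orthonormal) basis: the representing vector has components equal to the conjugates of the values of the functional on the basis vectors. The functional `phi`, the function name `riesz_vector`, and the test vector v are hypothetical choices made for illustration.

```python
def inner(u, v):
    """Standard inner product on C^n, conjugate-linear in the second argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def phi(v):
    """A sample linear functional on C^3 (an assumption, not from the text)."""
    x, y, z = v
    return (1 + 1j) * x + 2 * y - 1j * z

def riesz_vector(functional, n):
    """Vector u with functional(v) = <v, u>, built from the usual basis of C^n."""
    u = []
    for i in range(n):
        e_i = [1 if k == i else 0 for k in range(n)]
        u.append(complex(functional(e_i)).conjugate())
    return u

u = riesz_vector(phi, 3)
v = (2, 1j, 1 - 1j)
print(u)
print(phi(v), inner(v, u))
```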


13.4. Prove Theorem 13.1: Let T be a linear operator on an n-dimensional inner product space V. Then 


(a) There exists a unique linear operator T* on V such that 
⟨T(u), v⟩ = ⟨u, T*(v)⟩ for all u, v ∈ V. 


(b) Let A be the matrix that represents T relative to an orthonormal basis S = {u_i}. Then the 
conjugate transpose A* of A represents T* in the basis S. 


(a) We first define the mapping T*. Let v be an arbitrary but fixed element of V. The map u ↦ ⟨T(u), v⟩ 
is a linear functional on V. Hence, by Theorem 13.3, there exists a unique element v' ∈ V such 
that ⟨T(u), v⟩ = ⟨u, v'⟩ for every u ∈ V. We define T*: V → V by T*(v) = v'. Then 
⟨T(u), v⟩ = ⟨u, T*(v)⟩ for every u, v ∈ V. 

We next show that T* is linear. For any u, v_1, v_2 ∈ V, and any a, b ∈ K, 


\[ \begin{aligned} \langle u, T^*(av_1 + bv_2)\rangle &= \langle T(u), av_1 + bv_2\rangle = \bar{a}\langle T(u), v_1\rangle + \bar{b}\langle T(u), v_2\rangle \\ &= \bar{a}\langle u, T^*(v_1)\rangle + \bar{b}\langle u, T^*(v_2)\rangle = \langle u, aT^*(v_1) + bT^*(v_2)\rangle \end{aligned} \]


But this is true for every u ∈ V; hence, T*(av_1 + bv_2) = aT*(v_1) + bT*(v_2). Thus, T* is linear. 


(b) The matrices A = [a_ij] and B = [b_ij] that represent T and T*, respectively, relative to the orthonormal 
basis S are given by a_ij = ⟨T(u_j), u_i⟩ and b_ij = ⟨T*(u_j), u_i⟩ (Problem 13.67). Hence, 


\[ b_{ij} = \langle T^*(u_j), u_i\rangle = \overline{\langle u_i, T^*(u_j)\rangle} = \overline{\langle T(u_i), u_j\rangle} = \overline{a_{ji}} \]


Thus, B = A*, as claimed. 


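13.5. Prove Theorem 13.2: 
(i) (T_1 + T_2)* = T_1* + T_2*, (iii) (T_1 T_2)* = T_2* T_1*, 
(ii) (kT)* = k̄T*, (iv) (T*)* = T. 

(i) For any u, v ∈ V, 
\[ \begin{aligned} \langle (T_1 + T_2)(u), v\rangle &= \langle T_1(u) + T_2(u), v\rangle = \langle T_1(u), v\rangle + \langle T_2(u), v\rangle \\ &= \langle u, T_1^*(v)\rangle + \langle u, T_2^*(v)\rangle = \langle u, T_1^*(v) + T_2^*(v)\rangle \\ &= \langle u, (T_1^* + T_2^*)(v)\rangle \end{aligned} \]
The uniqueness of the adjoint implies (T_1 + T_2)* = T_1* + T_2*. 
(ii) For any u, v ∈ V, 
\[ \langle (kT)(u), v\rangle = \langle kT(u), v\rangle = k\langle T(u), v\rangle = k\langle u, T^*(v)\rangle = \langle u, \bar{k}T^*(v)\rangle = \langle u, (\bar{k}T^*)(v)\rangle \]
The uniqueness of the adjoint implies (kT)* = k̄T*. 
(iii) For any u, v ∈ V, 
\[ \begin{aligned} \langle (T_1T_2)(u), v\rangle &= \langle T_1(T_2(u)), v\rangle = \langle T_2(u), T_1^*(v)\rangle \\ &= \langle u, T_2^*(T_1^*(v))\rangle = \langle u, (T_2^*T_1^*)(v)\rangle \end{aligned} \]
The uniqueness of the adjoint implies (T_1 T_2)* = T_2* T_1*. 
(iv) For any u, v ∈ V, 
\[ \langle T^*(u), v\rangle = \overline{\langle v, T^*(u)\rangle} = \overline{\langle T(v), u\rangle} = \langle u, T(v)\rangle \]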


The uniqueness of the adjoint implies (T*)* = T. 


13.6. Show that (a) I* = I, and (b) 0* = 0. 

(a) For every u, v ∈ V, ⟨I(u), v⟩ = ⟨u, v⟩ = ⟨u, I(v)⟩; hence, I* = I. 
(b) For every u, v ∈ V, ⟨0(u), v⟩ = ⟨0, v⟩ = 0 = ⟨u, 0⟩ = ⟨u, 0(v)⟩; hence, 0* = 0. 


13.7. Suppose T is invertible. Show that (T^{-1})* = (T*)^{-1}. 

I = I* = (TT^{-1})* = (T^{-1})*T*; hence, (T^{-1})* = (T*)^{-1}. 


13.8. Let T be a linear operator on V, and let W be a T-invariant subspace of V. Show that W^⊥ is 
invariant under T*. 

Let u ∈ W^⊥. If w ∈ W, then T(w) ∈ W and so ⟨w, T*(u)⟩ = ⟨T(w), u⟩ = 0. Thus, T*(u) ∈ W^⊥ 
because it is orthogonal to every w ∈ W. Hence, W^⊥ is invariant under T*. 


13.9. Let T be a linear operator on V. Show that each of the following conditions implies T = 0: 

(i) ⟨T(u), v⟩ = 0 for every u, v ∈ V. 
(ii) V is a complex space, and ⟨T(u), u⟩ = 0 for every u ∈ V. 
(iii) T is self-adjoint and ⟨T(u), u⟩ = 0 for every u ∈ V. 

Give an example of an operator T on a real space V for which ⟨T(u), u⟩ = 0 for every u ∈ V but T ≠ 0. 
[Thus, (ii) need not hold for a real space V.] 

(i) Set v = T(u). Then ⟨T(u), T(u)⟩ = 0, and hence, T(u) = 0, for every u ∈ V. Accordingly, T = 0. 
(ii) By hypothesis, ⟨T(v + w), v + w⟩ = 0 for any v, w ∈ V. Expanding and setting ⟨T(v), v⟩ = 0 and 
⟨T(w), w⟩ = 0, we find 

\[ \langle T(v), w\rangle + \langle T(w), v\rangle = 0 \qquad (1) \]

Note w is arbitrary in (1). Substituting iw for w, and using ⟨T(v), iw⟩ = ī⟨T(v), w⟩ = -i⟨T(v), w⟩ and 
⟨T(iw), v⟩ = ⟨iT(w), v⟩ = i⟨T(w), v⟩, we find 

\[ -i\langle T(v), w\rangle + i\langle T(w), v\rangle = 0 \]

Dividing through by i and adding to (1), we obtain ⟨T(w), v⟩ = 0 for any v, w ∈ V. By (i), T = 0. 
(iii) By (ii), the result holds for the complex case; hence we need only consider the real case. Expanding 
⟨T(v + w), v + w⟩ = 0, we again obtain (1). Because T is self-adjoint and V is a real space, we 
have ⟨T(w), v⟩ = ⟨w, T(v)⟩ = ⟨T(v), w⟩. Substituting this into (1), we obtain ⟨T(v), w⟩ = 0 for any 
v, w ∈ V. By (i), T = 0. 

For an example, consider the linear operator T on R^2 defined by T(x, y) = (y, -x). Then 
⟨T(u), u⟩ = 0 for every u ∈ V, but T ≠ 0. 
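
The contrast between the real and complex cases shows up already in this example. The Python sketch below evaluates ⟨T(u), u⟩ for T(x, y) = (y, -x) on several real vectors, where it vanishes, and on the complex vector (1, i), where the same formula gives a nonzero value; the sample vectors and the helper name `inner` are illustrative assumptions.

```python
def T(u):
    """The operator T(x, y) = (y, -x) from the example above."""
    x, y = u
    return (y, -x)

def inner(u, v):
    """Standard inner product, conjugate-linear in the second argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

real_samples = [(1, 0), (0, 1), (2, -3), (1.5, 4)]
real_values = [inner(T(u), u) for u in real_samples]       # all zero

# The same formula on C^2 no longer gives zero for every vector,
# in agreement with condition (ii) of Problem 13.9.
complex_value = inner(T((1, 1j)), (1, 1j))
print(real_values, complex_value)
```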


Orthogonal and Unitary Operators and Matrices 


13.10. Prove Theorem 13.6: The following conditions on an operator U are equivalent: 


(i) U* = U^{-1}; that is, U is unitary. (ii) ⟨U(v), U(w)⟩ = ⟨v, w⟩. (iii) ||U(v)|| = ||v||. 

Suppose (i) holds. Then, for every v, w ∈ V, 


\[ \langle U(v), U(w)\rangle = \langle v, U^*U(w)\rangle = \langle v, I(w)\rangle = \langle v, w\rangle \]

Thus, (i) implies (ii). Now if (ii) holds, then 

\[ \|U(v)\| = \sqrt{\langle U(v), U(v)\rangle} = \sqrt{\langle v, v\rangle} = \|v\| \]


Hence, (ii) implies (iii). It remains to show that (iii) implies (i). 
Suppose (iii) holds. Then for every v ∈ V, 


\[ \langle U^*U(v), v\rangle = \langle U(v), U(v)\rangle = \langle v, v\rangle = \langle I(v), v\rangle \]


Hence, ⟨(U*U - I)(v), v⟩ = 0 for every v ∈ V. But U*U - I is self-adjoint (Prove!); then, by Problem 
13.9, we have U*U - I = 0 and so U*U = I. Thus, U* = U^{-1}, as claimed. 


13.11. Let U be a unitary (orthogonal) operator on V, and let W be a subspace invariant under U. Show 
that W^⊥ is also invariant under U. 

Because U is nonsingular, U(W) = W; that is, for any w ∈ W, there exists w' ∈ W such that 
U(w') = w. Now let v ∈ W^⊥. Then, for any w ∈ W, 

\[ \langle U(v), w\rangle = \langle U(v), U(w')\rangle = \langle v, w'\rangle = 0 \]

Thus, U(v) belongs to W^⊥. Therefore, W^⊥ is invariant under U. 


13.12. Prove Theorem 13.9: The change-of-basis matrix from an orthonormal basis {u_1, ..., u_n} into 
another orthonormal basis is unitary (orthogonal). Conversely, if P = [a_ij] is a unitary (orthogonal) 
matrix, then the vectors u'_i = Σ_j a_ji u_j form an orthonormal basis. 

Suppose {v_i} is another orthonormal basis and suppose 

\[ v_i = b_{i1}u_1 + b_{i2}u_2 + \cdots + b_{in}u_n, \quad i = 1, \ldots, n \qquad (1) \]

Because {v_i} is orthonormal, 

\[ \delta_{ij} = \langle v_i, v_j\rangle = b_{i1}\overline{b_{j1}} + b_{i2}\overline{b_{j2}} + \cdots + b_{in}\overline{b_{jn}} \qquad (2) \]

Let B = [b_ij] be the matrix of coefficients in (1). (Then B^T is the change-of-basis matrix from {u_i} to 
{v_i}.) Then BB* = [c_ij], where c_ij is the sum on the right side of (2). By (2), c_ij = δ_ij, and therefore BB* = I. 
Accordingly, B, and hence, B^T, is unitary. 

It remains to prove that {u'_i} is orthonormal. By Problem 13.67, 

\[ \langle u'_i, u'_j\rangle = a_{1i}\overline{a_{1j}} + a_{2i}\overline{a_{2j}} + \cdots + a_{ni}\overline{a_{nj}} = \langle C_i, C_j\rangle \]

where C_i denotes the ith column of the unitary (orthogonal) matrix P = [a_ij]. Because P is unitary 
(orthogonal), its columns are orthonormal; hence, ⟨u'_i, u'_j⟩ = ⟨C_i, C_j⟩ = δ_ij. Thus, {u'_i} is an orthonormal basis. 


Symmetric Operators and Canonical Forms in Euclidean Spaces 


13.13. Let T be a symmetric operator. Show that (a) The characteristic polynomial Δ(t) of T is a 
product of linear polynomials (over R); (b) T has a nonzero eigenvector. 

(a) Let A be a matrix representing T relative to an orthonormal basis of V; then A = A^T. Let Δ(t) be the 
characteristic polynomial of A. Viewing A as a complex self-adjoint operator, A has only real 
eigenvalues by Theorem 13.4. Thus, 

\[ \Delta(t) = (t - \lambda_1)(t - \lambda_2)\cdots(t - \lambda_n) \]

where the λ_i are all real. In other words, Δ(t) is a product of linear polynomials over R. 
(b) By (a), T has at least one (real) eigenvalue. Hence, T has a nonzero eigenvector. 


13.14. Prove Theorem 13.11: Let T be a symmetric operator on a real n-dimensional inner product 
space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T. (Hence, T 
can be represented by a diagonal matrix relative to an orthonormal basis.) 

The proof is by induction on the dimension of V. If dim V = 1, the theorem trivially holds. Now 
suppose dim V = n > 1. By Problem 13.13, there exists a nonzero eigenvector v_1 of T. Let W be the space 
spanned by v_1, and let u_1 be a unit vector in W, e.g., let u_1 = v_1/||v_1||. 

Because v_1 is an eigenvector of T, the subspace W of V is invariant under T. By Problem 13.8, W^⊥ is 
invariant under T* = T. Thus, the restriction T̂ of T to W^⊥ is a symmetric operator. By Theorem 7.4, 
V = W ⊕ W^⊥. Hence, dim W^⊥ = n - 1, because dim W = 1. By induction, there exists an orthonormal 
basis {u_2, ..., u_n} of W^⊥ consisting of eigenvectors of T̂ and hence of T. But ⟨u_1, u_i⟩ = 0 for i = 2, ..., n 
because u_i ∈ W^⊥. Accordingly {u_1, u_2, ..., u_n} is an orthonormal set and consists of eigenvectors of T. 
Thus, the theorem is proved. 


13.15. Let q(x, y) = 3x^2 - 6xy + 11y^2. Find an orthonormal change of coordinates (linear substitution) 
that diagonalizes the quadratic form q. 

Find the symmetric matrix A representing q and its characteristic polynomial Δ(t). We have 

\[ A = \begin{bmatrix} 3 & -3 \\ -3 & 11 \end{bmatrix} \quad\text{and}\quad \Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 - 14t + 24 = (t - 2)(t - 12) \]

The eigenvalues are λ = 2 and λ = 12. Hence, a diagonal form of q is 

\[ q(s, t) = 2s^2 + 12t^2 \]

(where we use s and t as new variables). The corresponding orthogonal change of coordinates is obtained 
by finding an orthogonal set of eigenvectors of A. 

Subtract λ = 2 down the diagonal of A to obtain the matrix 

\[ M = \begin{bmatrix} 1 & -3 \\ -3 & 9 \end{bmatrix}, \quad\text{corresponding to}\quad \begin{aligned} x - 3y &= 0 \\ -3x + 9y &= 0 \end{aligned} \quad\text{or}\quad x - 3y = 0 \]

A nonzero solution is u_1 = (3, 1). Next subtract λ = 12 down the diagonal of A to obtain the matrix 

\[ M = \begin{bmatrix} -9 & -3 \\ -3 & -1 \end{bmatrix}, \quad\text{corresponding to}\quad \begin{aligned} -9x - 3y &= 0 \\ -3x - y &= 0 \end{aligned} \quad\text{or}\quad 3x + y = 0 \]

A nonzero solution is u_2 = (-1, 3). Normalize u_1 and u_2 to obtain the orthonormal basis 

\[ \hat{u}_1 = (3/\sqrt{10},\ 1/\sqrt{10}), \qquad \hat{u}_2 = (-1/\sqrt{10},\ 3/\sqrt{10}) \]

Now let P be the matrix whose columns are û_1 and û_2. Then 

\[ P = \begin{bmatrix} 3/\sqrt{10} & -1/\sqrt{10} \\ 1/\sqrt{10} & 3/\sqrt{10} \end{bmatrix} \quad\text{and}\quad D = P^{-1}AP = P^TAP = \begin{bmatrix} 2 & 0 \\ 0 & 12 \end{bmatrix} \]

Thus, the required orthogonal change of coordinates is 

\[ \begin{bmatrix} x \\ y \end{bmatrix} = P\begin{bmatrix} s \\ t \end{bmatrix} \quad\text{or}\quad x = \frac{3s - t}{\sqrt{10}}, \quad y = \frac{s + 3t}{\sqrt{10}} \]

One can also express s and t in terms of x and y by using P^{-1} = P^T; that is, 

\[ s = \frac{3x + y}{\sqrt{10}}, \qquad t = \frac{-x + 3y}{\sqrt{10}} \]


13.16. Prove Theorem 13.12: Let T be an orthogonal operator on a real inner product space V. Then 
there exists an orthonormal basis of V in which T is represented by a block diagonal matrix M of 
the form 

\[ M = \operatorname{diag}\left(1, \ldots, 1,\; -1, \ldots, -1,\; \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix}, \ldots, \begin{bmatrix} \cos\theta_r & -\sin\theta_r \\ \sin\theta_r & \cos\theta_r \end{bmatrix}\right) \]


Let S = T + T^{-1} = T + T*. Then S* = (T + T*)* = T* + T = S. Thus, S is a symmetric operator 
on V. By Theorem 13.11, there exists an orthonormal basis of V consisting of eigenvectors of S. If 
λ_1, ..., λ_m denote the distinct eigenvalues of S, then V can be decomposed into the direct sum 
V = V_1 ⊕ V_2 ⊕ ... ⊕ V_m, where V_i consists of the eigenvectors of S belonging to λ_i. We claim that 
each V_i is invariant under T. For suppose v ∈ V_i; then S(v) = λ_i v and 

\[ S(T(v)) = (T + T^{-1})T(v) = T(T + T^{-1})(v) = TS(v) = T(\lambda_i v) = \lambda_i T(v) \]

That is, T(v) ∈ V_i. Hence, V_i is invariant under T. Because the V_i are orthogonal to each other, we can 
restrict our investigation to the way that T acts on each individual V_i. 

On a given V_i, we have (T + T^{-1})v = S(v) = λ_i v. Multiplying by T, we get 

\[ (T^2 - \lambda_i T + I)(v) = 0 \qquad (1) \]

We consider the cases λ_i = ±2 and λ_i ≠ ±2 separately. If λ_i = ±2, then (T ∓ I)^2(v) = 0, which leads to 
(T ∓ I)(v) = 0 or T(v) = ±v. Thus, T restricted to this V_i is either I or -I. 

If λ_i ≠ ±2, then T has no eigenvectors in V_i, because, by Theorem 13.4, the only eigenvalues of T are 
1 or -1. Accordingly, for v ≠ 0, the vectors v and T(v) are linearly independent. Let W be the subspace 
spanned by v and T(v). Then W is invariant under T, because using (1) we get 

\[ T(T(v)) = T^2(v) = \lambda_i T(v) - v \in W \]

By Theorem 7.4, V_i = W ⊕ W^⊥. Furthermore, by Problem 13.8, W^⊥ is also invariant under T. Thus, we 
can decompose V_i into the direct sum of two-dimensional subspaces W_j, where the W_j are orthogonal to 
each other and each W_j is invariant under T. Thus, we can restrict our investigation to the way in which T 
acts on each individual W_j. 

Because T^2 - λ_i T + I = 0, the characteristic polynomial Δ(t) of T acting on W_j is 
Δ(t) = t^2 - λ_i t + 1. Thus, the determinant of T is 1, the constant term in Δ(t). By Theorem 2.7, the 
matrix A representing T acting on W_j relative to any orthonormal basis of W_j must be of the form 

\[ \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \]

The union of the bases of the W_j gives an orthonormal basis of V_i, and the union of the bases of the V_i gives 
an orthonormal basis of V in which the matrix representing T is of the desired form. 


Normal Operators and Canonical Forms in Unitary Spaces 


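13.17. Determine which of the following matrices is normal: 

\[ (a)\ A = \begin{bmatrix} 1 & i \\ 0 & 1 \end{bmatrix}, \qquad (b)\ B = \begin{bmatrix} 1 & i \\ 1 & 2+i \end{bmatrix} \]

(a) We have 

\[ AA^* = \begin{bmatrix} 1 & i \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ -i & 1 \end{bmatrix} = \begin{bmatrix} 2 & i \\ -i & 1 \end{bmatrix}, \qquad A^*A = \begin{bmatrix} 1 & 0 \\ -i & 1 \end{bmatrix}\begin{bmatrix} 1 & i \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & i \\ -i & 2 \end{bmatrix} \]

Because AA* ≠ A*A, the matrix A is not normal. 

(b) We have 

\[ BB^* = \begin{bmatrix} 1 & i \\ 1 & 2+i \end{bmatrix}\begin{bmatrix} 1 & 1 \\ -i & 2-i \end{bmatrix} = \begin{bmatrix} 2 & 2+2i \\ 2-2i & 6 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -i & 2-i \end{bmatrix}\begin{bmatrix} 1 & i \\ 1 & 2+i \end{bmatrix} = B^*B \]

Because BB* = B*B, the matrix B is normal. 


13.18. Let T be a normal operator. Prove the following: 
(a) T(v) = 0 if and only if T*(v) = 0. (b) T - λI is normal. 
(c) If T(v) = λv, then T*(v) = λ̄v; hence, any eigenvector of T is also an eigenvector of T*. 
(d) If T(v) = λ_1 v and T(w) = λ_2 w where λ_1 ≠ λ_2, then ⟨v, w⟩ = 0; that is, eigenvectors of T 
belonging to distinct eigenvalues are orthogonal. 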


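(a) We show that ⟨T(v), T(v)⟩ = ⟨T*(v), T*(v)⟩: 
\[ \langle T(v), T(v)\rangle = \langle v, T^*T(v)\rangle = \langle v, TT^*(v)\rangle = \langle T^*(v), T^*(v)\rangle \]
Hence, by [I_3] in the definition of the inner product in Section 7.2, T(v) = 0 if and only if T*(v) = 0. 
(b) We show that T - λI commutes with its adjoint: 
\[ \begin{aligned} (T - \lambda I)(T - \lambda I)^* &= (T - \lambda I)(T^* - \bar{\lambda} I) = TT^* - \lambda T^* - \bar{\lambda}T + \lambda\bar{\lambda}I \\ &= T^*T - \bar{\lambda}T - \lambda T^* + \bar{\lambda}\lambda I = (T^* - \bar{\lambda}I)(T - \lambda I) \\ &= (T - \lambda I)^*(T - \lambda I) \end{aligned} \]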


Thus, T - λI is normal. 

(c) If T(v) = λv, then (T - λI)(v) = 0. Now T - λI is normal by (b); therefore, by (a), 
(T - λI)*(v) = 0. That is, (T* - λ̄I)(v) = 0; hence, T*(v) = λ̄v. 
(d) We show that λ_1⟨v, w⟩ = λ_2⟨v, w⟩: 
\[ \lambda_1\langle v, w\rangle = \langle \lambda_1 v, w\rangle = \langle T(v), w\rangle = \langle v, T^*(w)\rangle = \langle v, \bar{\lambda}_2 w\rangle = \lambda_2\langle v, w\rangle \]
But λ_1 ≠ λ_2; hence, ⟨v, w⟩ = 0. 


13.19. Prove Theorem 13.13: Let T be a normal operator on a complex finite-dimensional inner product 
space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T. (Thus, T 
can be represented by a diagonal matrix relative to an orthonormal basis.) 

The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially holds. Now 
suppose dim V = n > 1. Because V is a complex vector space, T has at least one eigenvalue and hence a 
nonzero eigenvector v. Let W be the subspace of V spanned by v, and let u_1 be a unit vector in W. 

Because v is an eigenvector of T, the subspace W is invariant under T. However, v is also an 
eigenvector of T* by Problem 13.18; hence, W is also invariant under T*. By Problem 13.8, W^⊥ is 
invariant under T** = T. The remainder of the proof is identical with the latter part of the proof of 
Theorem 13.11 (Problem 13.14). 


13.20. Prove Theorem 13.14: Let T be any operator on a complex finite-dimensional inner product 
space V. Then T can be represented by a triangular matrix relative to an orthonormal basis of V. 


The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially holds. Now 
suppose dim V = n > 1. Because V is a complex vector space, T has at least one eigenvalue and hence at 
least one nonzero eigenvector v. Let W be the subspace of V spanned by v, and let u_1 be a unit vector in W. 
Then u_1 is an eigenvector of T and, say, T(u_1) = a_11 u_1. 

By Theorem 7.4, V = W ⊕ W^⊥. Let E denote the orthogonal projection of V onto W^⊥. Clearly W^⊥ is 
invariant under the operator ET. By induction, there exists an orthonormal basis {u_2, ..., u_n} of W^⊥ such 
that, for i = 2, ..., n, 

\[ ET(u_i) = a_{i2}u_2 + a_{i3}u_3 + \cdots + a_{ii}u_i \]

(Note that {u_1, u_2, ..., u_n} is an orthonormal basis of V.) But E is the orthogonal projection of V onto W^⊥; 
hence, we must have 

\[ T(u_i) = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{ii}u_i \]

for i = 2, ..., n. This with T(u_1) = a_11 u_1 gives us the desired result. 


Miscellaneous Problems 


13.21. Prove Theorem 13.10B: The following are equivalent: 
(i) P = T^2 for some self-adjoint operator T. 
(ii) P = S*S for some operator S; that is, P is positive. 
(iii) P is self-adjoint and ⟨P(u), u⟩ ≥ 0 for every u ∈ V. 

Suppose (i) holds; that is, P = T^2 where T = T*. Then P = TT = T*T, and so (i) implies (ii). Now 
suppose (ii) holds. Then P* = (S*S)* = S*S** = S*S = P, and so P is self-adjoint. Furthermore, 

\[ \langle P(u), u\rangle = \langle S^*S(u), u\rangle = \langle S(u), S(u)\rangle \ge 0 \]

Thus, (ii) implies (iii), and so it remains to prove that (iii) implies (i). 

Now suppose (iii) holds. Because P is self-adjoint, there exists an orthonormal basis {u_1, ..., u_n} of V 
consisting of eigenvectors of P; say, P(u_i) = λ_i u_i. By Theorem 13.4, the λ_i are real. Using (iii), we show 
that the λ_i are nonnegative. We have, for each i, 

\[ 0 \le \langle P(u_i), u_i\rangle = \langle \lambda_i u_i, u_i\rangle = \lambda_i\langle u_i, u_i\rangle \]

Thus, ⟨u_i, u_i⟩ > 0 forces λ_i ≥ 0, as claimed. Accordingly, √λ_i is a real number. Let T be the linear 
operator defined by 
\[ T(u_i) = \sqrt{\lambda_i}\,u_i \quad\text{for } i = 1, \ldots, n \]

Because T is represented by a real diagonal matrix relative to the orthonormal basis {u_i}, T is self-adjoint. 
Moreover, for each i, 

\[ T^2(u_i) = T(\sqrt{\lambda_i}\,u_i) = \sqrt{\lambda_i}\,T(u_i) = \sqrt{\lambda_i}\sqrt{\lambda_i}\,u_i = \lambda_i u_i = P(u_i) \]

Because T^2 and P agree on a basis of V, P = T^2. Thus, the theorem is proved. 


Remark: The above operator T is the unique positive operator such that P = T^2; it is called the 
positive square root of P. 


13.22. Show that any operator T is the sum of a self-adjoint operator and a skew-adjoint operator. 

Set S = (1/2)(T + T*) and U = (1/2)(T - T*). Then T = S + U, where 

\[ S^* = \left[\tfrac{1}{2}(T + T^*)\right]^* = \tfrac{1}{2}(T^* + T^{**}) = \tfrac{1}{2}(T^* + T) = S \]
\[ U^* = \left[\tfrac{1}{2}(T - T^*)\right]^* = \tfrac{1}{2}(T^* - T) = -\tfrac{1}{2}(T - T^*) = -U \]

that is, S is self-adjoint and U is skew-adjoint. 


13.23. Prove: Let T be an arbitrary linear operator on a finite-dimensional inner product space V. Then 
T is a product of a unitary (orthogonal) operator U and a unique positive operator P; that is, 
T = UP. Furthermore, if T is invertible, then U is also uniquely determined. 


By Theorem 13.10, T*T is a positive operator; hence, there exists a (unique) positive operator P such 
that P^2 = T*T (Problem 13.43). Observe that 

\[ \|P(v)\|^2 = \langle P(v), P(v)\rangle = \langle P^2(v), v\rangle = \langle T^*T(v), v\rangle = \langle T(v), T(v)\rangle = \|T(v)\|^2 \qquad (1) \]

We now consider separately the cases when T is invertible and noninvertible. 

If T is invertible, then we set Û = PT^{-1}. We show that Û is unitary: 

\[ \hat{U}^* = (PT^{-1})^* = (T^{-1})^*P^* = (T^*)^{-1}P \quad\text{and}\quad \hat{U}^*\hat{U} = (T^*)^{-1}PPT^{-1} = (T^*)^{-1}T^*TT^{-1} = I \]

Thus, Û is unitary. We next set U = Û^{-1}. Then U is also unitary, and T = UP as required. 

To prove uniqueness, we assume T = U_0 P_0, where U_0 is unitary and P_0 is positive. Then 

\[ T^*T = P_0^*U_0^*U_0P_0 = P_0 I P_0 = P_0^2 \]

But the positive square root of T*T is unique (Problem 13.43); hence, P_0 = P. (Note that the invertibility 
of T is not used to prove the uniqueness of P.) Now if T is invertible, then P is also invertible by (1). 
Multiplying U_0 P = UP on the right by P^{-1} yields U_0 = U. Thus, U is also unique when T is invertible. 

Now suppose T is not invertible. Let W be the image of P; that is, W = Im P. We define U_1: W → V by 

\[ U_1(w) = T(v), \quad\text{where } P(v) = w \qquad (2) \]

We must show that U_1 is well defined; that is, that P(v) = P(v') implies T(v) = T(v'). This follows from 
the fact that P(v - v') = 0 is equivalent to ||P(v - v')|| = 0, which forces ||T(v - v')|| = 0 by (1). Thus, 
U_1 is well defined. We next define U_2: W^⊥ → V. Note that, by (1), P and T have the same kernels. Hence, the 
images of P and T have the same dimension; that is, dim(Im P) = dim W = dim(Im T). Consequently, 
W^⊥ and (Im T)^⊥ also have the same dimension. We let U_2 be any isomorphism between W^⊥ and (Im T)^⊥ 
that preserves inner products (for example, one that maps an orthonormal basis onto an orthonormal basis). 

We next set U = U_1 ⊕ U_2. [Here U is defined as follows: If v ∈ V and v = w + w', where w ∈ W, 
w' ∈ W^⊥, then U(v) = U_1(w) + U_2(w').] Now U is linear (Problem 13.69), and, if v ∈ V and P(v) = w, 
then, by (2), 

\[ T(v) = U_1(w) = U(w) = UP(v) \]

Thus, T = UP, as required. 

It remains to show that U is unitary. Now every vector x ∈ V can be written in the form x = P(v) + w', 
where w' ∈ W^⊥. Then U(x) = UP(v) + U_2(w') = T(v) + U_2(w'), where ⟨T(v), U_2(w')⟩ = 0 by definition 
of U_2. Also, ⟨T(v), T(v)⟩ = ⟨P(v), P(v)⟩ by (1). Thus, 

\[ \begin{aligned} \langle U(x), U(x)\rangle &= \langle T(v) + U_2(w'), T(v) + U_2(w')\rangle = \langle T(v), T(v)\rangle + \langle U_2(w'), U_2(w')\rangle \\ &= \langle P(v), P(v)\rangle + \langle w', w'\rangle = \langle P(v) + w', P(v) + w'\rangle = \langle x, x\rangle \end{aligned} \]

[We also used the fact that ⟨P(v), w'⟩ = 0.] Thus, U is unitary, and the theorem is proved. 


13.24. Let V be the vector space of polynomials over R with inner product defined by 

\[ \langle f, g\rangle = \int_0^1 f(t)g(t)\,dt \]


Give an example of a linear functional φ on V for which Theorem 13.3 does not hold, that is, 
for which there is no polynomial h(t) such that φ(f) = ⟨f, h⟩ for every f ∈ V. 

Let φ: V → R be defined by φ(f) = f(0); that is, φ evaluates f(t) at 0, and hence maps f(t) into its 
constant term. Suppose a polynomial h(t) exists for which 

\[ \phi(f) = f(0) = \int_0^1 f(t)h(t)\,dt \qquad (1) \]

for every polynomial f(t). Observe that φ maps the polynomial tf(t) into 0; hence, by (1), 

\[ \int_0^1 t f(t)h(t)\,dt = 0 \qquad (2) \]

for every polynomial f(t). In particular (2) must hold for f(t) = th(t); that is, 

\[ \int_0^1 t^2h^2(t)\,dt = 0 \]

This integral forces h(t) to be the zero polynomial; hence, φ(f) = ⟨f, h⟩ = ⟨f, 0⟩ = 0 for every 
polynomial f(t). This contradicts the fact that φ is not the zero functional; hence, the polynomial h(t) 
does not exist. 


SUPPLEMENTARY PROBLEMS 


Adjoint Operators 


13.25. Find the adjoint of: 

\[ (a)\ A = \begin{bmatrix} 5-2i & 3+7i \\ 4-6i & 8+3i \end{bmatrix}, \qquad (b)\ B = \begin{bmatrix} 3 & 5i \\ i & -2i \end{bmatrix}, \qquad (c)\ C = \begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix} \]

13.26. Let T: R^3 → R^3 be defined by T(x, y, z) = (x + 2y, 3x - 4z, y). Find T*(x, y, z). 

13.27. Let T: C^3 → C^3 be defined by T(x, y, z) = [ix + (2 + 3i)y, 3x + (3 - i)z, (2 - 5i)y + iz]. 
Find T*(x, y, z). 

13.28. For each linear functional φ on V, find u ∈ V such that φ(v) = ⟨v, u⟩ for every v ∈ V: 

(a) φ: R^3 → R defined by φ(x, y, z) = x + 2y - 3z. 
(b) φ: C^3 → C defined by φ(x, y, z) = ix + (2 + 3i)y + (1 - 2i)z. 

13.29. Suppose V has finite dimension. Prove that the image of T* is the orthogonal complement of the kernel of 
T; that is, Im T* = (Ker T)^⊥. Hence, rank(T) = rank(T*). 

13.30. Show that T*T = 0 implies T = 0. 

13.31. Let V be the vector space of polynomials over R with inner product defined by ⟨f, g⟩ = ∫_0^1 f(t)g(t) dt. Let 
D be the derivative operator on V; that is, D(f) = df/dt. Show that there is no operator D* on V such that 
⟨D(f), g⟩ = ⟨f, D*(g)⟩ for every f, g ∈ V. That is, D has no adjoint. 


Unitary and Orthogonal Operators and Matrices 


13.32. Find a unitary (orthogonal) matrix whose first row is 
(a) (2/√13, 3/√13), (b) a multiple of (1, 1 - i), (c) a multiple of (1, -i, 1 - i). 

13.33. Prove that the products and inverses of orthogonal matrices are orthogonal. (Thus, the orthogonal matrices 
form a group under multiplication, called the orthogonal group.) 

13.34. Prove that the products and inverses of unitary matrices are unitary. (Thus, the unitary matrices form a 
group under multiplication, called the unitary group.) 

13.35. Show that if an orthogonal (unitary) matrix is triangular, then it is diagonal. 

13.36. Recall that the complex matrices A and B are unitarily equivalent if there exists a unitary matrix P such that 
B = P*AP. Show that this relation is an equivalence relation. 

13.37. Recall that the real matrices A and B are orthogonally equivalent if there exists an orthogonal matrix P such 
that B = P^T AP. Show that this relation is an equivalence relation. 

13.38. Let W be a subspace of V. For any v ∈ V, let v = w + w', where w ∈ W, w' ∈ W^⊥. (Such a sum is unique 
because V = W ⊕ W^⊥.) Let T: V → V be defined by T(v) = w - w'. Show that T is a self-adjoint unitary 
operator on V. 

13.39. Let V be an inner product space, and suppose U: V → V (not assumed linear) is surjective (onto) and 
preserves inner products; that is, ⟨U(v), U(w)⟩ = ⟨v, w⟩ for every v, w ∈ V. Prove that U is linear and 
hence unitary. 


Positive and Positive Definite Operators 


13.40. Show that the sum of two positive (positive definite) operators is positive (positive definite). 

13.41. Let T be a linear operator on V and let f: V × V → K be defined by f(u, v) = ⟨T(u), v⟩. Show that f is an 
inner product on V if and only if T is positive definite. 

13.42. Suppose E is an orthogonal projection onto some subspace W of V. Prove that kI + E is positive (positive 
definite) if k ≥ 0 (k > 0). 

13.43. Consider the operator T defined by T(u_i) = √λ_i u_i, i = 1, ..., n, in the proof of Theorem 13.10A. Show 
that T is positive and that it is the only positive operator for which T^2 = P. 

13.44. Suppose P is both positive and unitary. Prove that P = I. 

13.45. Determine which of the following matrices are positive (positive definite): 

\[ (i)\ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad (ii)\ \begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix}, \quad (iii)\ \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \quad (iv)\ \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad (v)\ \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \quad (vi)\ \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \]

13.46. Prove that a 2 × 2 complex matrix 

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

is positive if and only if (i) A = A*, and (ii) a, d, and |A| = ad - bc are nonnegative real numbers. 

13.47. Prove that a diagonal matrix A is positive (positive definite) if and only if every diagonal entry is a 
nonnegative (positive) real number. 


Self-adjoint and Symmetric Matrices 


13.48. For any operator T, show that $T + T^*$ is self-adjoint and $T - T^*$ is skew-adjoint.

13.49. Suppose T is self-adjoint. Show that $T^2(v) = 0$ implies $T(v) = 0$. Use this to prove that $T^n(v) = 0$ also
implies that $T(v) = 0$ for $n > 0$.

13.50. Let V be a complex inner product space. Suppose $\langle T(v), v \rangle$ is real for every $v \in V$. Show that T is
self-adjoint.

13.51. Suppose $T_1$ and $T_2$ are self-adjoint. Show that $T_1 T_2$ is self-adjoint if and only if $T_1$ and $T_2$ commute; that is,
$T_1 T_2 = T_2 T_1$.

13.52. For each of the following symmetric matrices A, find an orthogonal matrix P and a diagonal matrix D such
that $P^T A P = D$:
(a) $A = [1, 2;\ 2, -2]$, (b) $A = [5, 4;\ 4, -1]$, (c) $A = [7, 3;\ 3, -1]$

13.53. Find an orthogonal change of coordinates $X = PX'$ that diagonalizes each of the following quadratic forms
and find the corresponding diagonal quadratic form $q(x')$:
(a) $q(x, y) = 2x^2 - 6xy + 10y^2$, (b) $q(x, y) = x^2 + 8xy - 5y^2$,
(c) $q(x, y, z) = 2x^2 - 4xy + 5y^2 + 2xz - 4yz + 2z^2$


Normal Operators and Matrices 


13.54. Let $A = [2, i;\ i, 2]$. Verify that A is normal. Find a unitary matrix P such that $P^*AP$ is diagonal. Find $P^*AP$.

13.55. Show that a triangular matrix is normal if and only if it is diagonal.

13.56. Prove that if T is normal on V, then $\|T(v)\| = \|T^*(v)\|$ for every $v \in V$. Prove that the converse holds in
complex inner product spaces.

13.57. Show that self-adjoint, skew-adjoint, and unitary (orthogonal) operators are normal.

13.58. Suppose T is normal. Prove that
(a) T is self-adjoint if and only if its eigenvalues are real.
(b) T is unitary if and only if its eigenvalues have absolute value 1.
(c) T is positive if and only if its eigenvalues are nonnegative real numbers.

13.59. Show that if T is normal, then T and $T^*$ have the same kernel and the same image.

13.60. Suppose $T_1$ and $T_2$ are normal and commute. Show that $T_1 + T_2$ and $T_1 T_2$ are also normal.

13.61. Suppose $T_1$ is normal and commutes with $T_2$. Show that $T_1$ also commutes with $T_2^*$.

13.62. Prove the following: Let $T_1$ and $T_2$ be commuting normal operators on a complex finite-dimensional vector space V.
Then there exists an orthonormal basis of V consisting of eigenvectors of both $T_1$ and $T_2$. (That is, $T_1$ and
$T_2$ can be simultaneously diagonalized.)


Isomorphism Problems for Inner Product Spaces 


13.63. Let $S = \{u_1, \ldots, u_n\}$ be an orthonormal basis of an inner product space V over K. Show that the mapping
$v \mapsto [v]_S$ is an (inner product space) isomorphism between V and $K^n$. (Here $[v]_S$ denotes the coordinate
vector of v in the basis S.)

13.64. Show that inner product spaces V and W over K are isomorphic if and only if V and W have the same
dimension.

13.65. Suppose $\{u_1, \ldots, u_n\}$ and $\{u'_1, \ldots, u'_n\}$ are orthonormal bases of V and W, respectively. Let $T: V \to W$ be
the linear map defined by $T(u_i) = u'_i$ for each i. Show that T is an isomorphism.

13.66. Let V be an inner product space. Recall that each $u \in V$ determines a linear functional $\hat{u}$ in the dual space
$V^*$ by the definition $\hat{u}(v) = \langle v, u \rangle$ for every $v \in V$. (See the text immediately preceding Theorem 13.3.)
Show that the map $u \mapsto \hat{u}$ is linear and nonsingular, and hence an isomorphism from V onto $V^*$.


Miscellaneous Problems 


13.67. Suppose $\{u_1, \ldots, u_n\}$ is an orthonormal basis of V. Prove
(a) $\langle a_1u_1 + a_2u_2 + \cdots + a_nu_n,\ b_1u_1 + b_2u_2 + \cdots + b_nu_n \rangle = a_1\bar{b}_1 + a_2\bar{b}_2 + \cdots + a_n\bar{b}_n$
(b) Let $A = [a_{ij}]$ be the matrix representing $T: V \to V$ in the basis $\{u_i\}$. Then $a_{ij} = \langle T(u_j), u_i \rangle$.

13.68. Show that there exists an orthonormal basis $\{u_1, \ldots, u_n\}$ of V consisting of eigenvectors of T if and only if
there exist orthogonal projections $E_1, \ldots, E_r$ and scalars $\lambda_1, \ldots, \lambda_r$ such that
(i) $T = \lambda_1E_1 + \cdots + \lambda_rE_r$, (ii) $E_1 + \cdots + E_r = I$, (iii) $E_iE_j = 0$ for $i \ne j$

13.69. Suppose $V = U \oplus W$ and suppose $T_1: U \to V$ and $T_2: W \to V$ are linear. Show that $T = T_1 \oplus T_2$ is also
linear. Here T is defined as follows: If $v \in V$ and $v = u + w$ where $u \in U$, $w \in W$, then
\[ T(v) = T_1(u) + T_2(w) \]


ANSWERS TO SUPPLEMENTARY PROBLEMS 


Notation: $[R_1;\ R_2;\ \ldots;\ R_n]$ denotes a matrix with rows $R_1, R_2, \ldots, R_n$.
13.25. (a) [5+2i, 4+6; 3-7i, 8-3], Œ) [,-i; —5i,2i, (©) [1,2; 1,3] 
13.26. T*(x,y,z) = (x+3y, 2x+z, —4y) 
13.27. T*(x,y,z) = [ix + 3y, (2-— 3i)x + (2+ 5i)z, (3 + i)y — iz] 
13.28. (a) w=(1,2,-3), (b) u=(-i, 2—3i, 1+2i) 
13.32. (a) (1/V13)[2,3; 3,-2), (b) (1/V3)[1, 1-é 1+i -1], 
(c) H, =i 1-# v2i, -v2, 0; 1, -i, -1+ 


13.45. Only (i) and (v) are positive. Only (v) is positive definite.

13.52. (a and b) $P = (1/\sqrt{5})[2, -1;\ 1, 2]$, (c) $P = (1/\sqrt{10})[3, -1;\ 1, 3]$;
(a) $D = [2, 0;\ 0, -3]$, (b) $D = [7, 0;\ 0, -3]$, (c) $D = [8, 0;\ 0, -2]$

13.53. (a) $x = (3x' - y')/\sqrt{10}$, $y = (x' + 3y')/\sqrt{10}$; (b) $x = (2x' - y')/\sqrt{5}$, $y = (x' + 2y')/\sqrt{5}$;
(c) $x = x'/\sqrt{3} + y'/\sqrt{2} + z'/\sqrt{6}$, $y = x'/\sqrt{3} - 2z'/\sqrt{6}$, $z = x'/\sqrt{3} - y'/\sqrt{2} + z'/\sqrt{6}$;
(a) $q(x') = \mathrm{diag}(1, 11)$; (b) $q(x') = \mathrm{diag}(3, -7)$; (c) $q(x') = \mathrm{diag}(1, 1, 7)$

13.54. (a) $P = (1/\sqrt{2})[1, -1;\ 1, 1]$, $P^*AP = \mathrm{diag}(2 + i,\ 2 - i)$


APPENDIX A 


Multilinear Products 


A.1 Introduction 


The material in this appendix is much more abstract than that which has previously appeared. Accordingly, 
many of the proofs will be omitted. Also, we motivate the material with the following observation. 
Let S be a basis of a vector space V. Theorem 5.2 may be restated as follows. 


THEOREM 5.2: Let $g: S \to V$ be the inclusion map of the basis S into V. Then, for any vector space
U and any mapping $f: S \to U$, there exists a unique linear mapping $f^*: V \to U$ such
that $f = f^* \circ g$.


Another way to state the fact that $f = f^* \circ g$ is that the diagram in Fig. A-1(a) commutes.


[Figure A-1: commutative diagrams relating g, f, and $f^*$ for (a) a basis S of V, (b) the tensor product $T = V \otimes W$, (c) the exterior product $E = \wedge^r V$.]


A.2 Bilinear Mapping and Tensor Products 


Let U, V, W be vector spaces over a field K. Consider a map
\[ f: V \times W \to U \]

Then f is said to be bilinear if, for each $v \in V$, the map $f_v: W \to U$ defined by $f_v(w) = f(v, w)$ is linear;
and, for each $w \in W$, the map $f_w: V \to U$ defined by $f_w(v) = f(v, w)$ is linear.

That is, f is linear in each of its two variables. Note that f is similar to a bilinear form except that the
values of the map f are in a vector space U rather than the field K.


DEFINITION A.1: Let V and W be vector spaces over the same field K. The tensor product of V and
W is a vector space T over K together with a bilinear map $g: V \times W \to T$,
denoted by $g(v, w) = v \otimes w$, with the following property: (*) For any vector
space U over K and any bilinear map $f: V \times W \to U$ there exists a unique linear
map $f^*: T \to U$ such that $f^* \circ g = f$.


The tensor product (T, g) [or simply T when g is understood] of V and W is denoted by $V \otimes W$, and the
element $v \otimes w$ is called the tensor of v and w.

Another way to state condition (*) is that the diagram in Fig. A-1(b) commutes. The fact that such
a unique linear map $f^*$ exists is called the "Universal Mapping Principle" (UMP). As illustrated in
Fig. A-1(b), condition (*) also says that any bilinear map $f: V \times W \to U$ "factors through" the tensor
product $T = V \otimes W$. The uniqueness in (*) implies that the image of g spans T; that is, $\mathrm{span}(\{v \otimes w\}) = T$.


THEOREM A.1: (Uniqueness of Tensor Products) Let $(T, g)$ and $(T', g')$ be tensor products of V
and W. Then there exists a unique isomorphism $h: T \to T'$ such that $hg = g'$.


Proof. Because T is a tensor product, and $g': V \times W \to T'$ is bilinear, there exists a unique linear map
$h: T \to T'$ such that $hg = g'$. Similarly, because $T'$ is a tensor product, and $g: V \times W \to T$ is bilinear,
there exists a unique linear map $h': T' \to T$ such that $h'g' = g$. Using $hg = g'$, we get $h'hg = g$. Also,
because T is a tensor product, and $g: V \times W \to T$ is bilinear, there exists a unique linear map $h^*: T \to T$
such that $h^*g = g$. But $1_T\, g = g$. Thus, $h'h = h^* = 1_T$. Similarly, $hh' = 1_{T'}$. Therefore, h is an
isomorphism from T to $T'$.


THEOREM A.2: (Existence of Tensor Product) The tensor product $T = V \otimes W$ of vector spaces V
and W over K exists. Let $\{v_1, \ldots, v_m\}$ be a basis of V and let $\{w_1, \ldots, w_n\}$ be a
basis of W. Then the mn vectors
\[ v_i \otimes w_j \quad (i = 1, \ldots, m;\ j = 1, \ldots, n) \]
form a basis of T. Thus, $\dim T = mn = (\dim V)(\dim W)$.

Outline of Proof. Suppose $\{v_1, \ldots, v_m\}$ is a basis of V, and suppose $\{w_1, \ldots, w_n\}$ is a basis of W.
Consider the mn symbols $\{t_{ij} \mid i = 1, \ldots, m,\ j = 1, \ldots, n\}$. Let T be the vector space generated by the $t_{ij}$.
That is, T consists of all linear combinations of the $t_{ij}$ with coefficients in K. [See Problem 4.137.]
Let $v \in V$ and $w \in W$. Say
\[ v = a_1v_1 + a_2v_2 + \cdots + a_mv_m \quad\text{and}\quad w = b_1w_1 + b_2w_2 + \cdots + b_nw_n \]
Let $g: V \times W \to T$ be defined by
\[ g(v, w) = \sum_i \sum_j a_i b_j t_{ij} \]
Then g is bilinear. [Proof left to reader.]

Now let $f: V \times W \to U$ be bilinear. Because the $t_{ij}$ form a basis of T, Theorem 5.2 (stated above) tells
us that there exists a unique linear map $f^*: T \to U$ such that $f^*(t_{ij}) = f(v_i, w_j)$. Then, for $v = \sum_i a_iv_i$ and
$w = \sum_j b_jw_j$, we have
\[ f(v, w) = f\Big(\sum_i a_iv_i,\ \sum_j b_jw_j\Big) = \sum_i \sum_j a_ib_j\, f(v_i, w_j) = \sum_i \sum_j a_ib_j\, f^*(t_{ij}) = f^*(g(v, w)). \]
Therefore, $f = f^*g$ where $f^*$ is the required map in Definition A.1. Thus, T is a tensor product.

Let $\{v'_1, \ldots, v'_m\}$ be any basis of V and $\{w'_1, \ldots, w'_n\}$ be any basis of W.
Let $v \in V$ and $w \in W$ and say
\[ v = a'_1v'_1 + \cdots + a'_mv'_m \quad\text{and}\quad w = b'_1w'_1 + \cdots + b'_nw'_n \]
Then
\[ v \otimes w = g(v, w) = \sum_i \sum_j a'_ib'_j\, g(v'_i, w'_j) = \sum_i \sum_j a'_ib'_j\, (v'_i \otimes w'_j) \]
Thus, the elements $v'_i \otimes w'_j$ span T. There are mn such elements. They cannot be linearly dependent
because $\{t_{ij}\}$ is a basis of T, and hence, $\dim T = mn$. Thus, the $v'_i \otimes w'_j$ form a basis of T.


Next we give two concrete examples of tensor products. 


EXAMPLE A.1 Let V be the vector space of polynomials $P_{r-1}(x)$ and let W be the vector space of polynomials
$P_{s-1}(y)$. Thus, the following form bases of V and W, respectively,
\[ 1, x, x^2, \ldots, x^{r-1} \quad\text{and}\quad 1, y, y^2, \ldots, y^{s-1} \]
In particular, $\dim V = r$ and $\dim W = s$. Let T be the vector space of polynomials in variables x and y
with basis
\[ \{x^iy^j\} \quad\text{where } i = 0, 1, \ldots, r-1;\ j = 0, 1, \ldots, s-1 \]
Then T is the tensor product $V \otimes W$ under the mapping
\[ x^i \otimes y^j = x^iy^j \]
For example, suppose $v = 2 - 5x + 3x^3$ and $w = 7y + 4y^2$. Then
\[ v \otimes w = 14y + 8y^2 - 35xy - 20xy^2 + 21x^3y + 12x^3y^2 \]
Note, $\dim T = rs = (\dim V)(\dim W)$.
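The product in Example A.1 can be checked mechanically. The following is a minimal sketch, assuming a polynomial is stored as a list of coefficients (index = exponent); the helper name `tensor_poly` is hypothetical and introduced only for illustration. It multiplies every coefficient of v by every coefficient of w, which is exactly the bilinear extension of $x^i \otimes y^j = x^iy^j$.

```python
def tensor_poly(v, w):
    """Return {(i, j): coefficient of x^i y^j} for v (x) w.

    v[i] is the coefficient of x^i and w[j] is the coefficient of y^j.
    Zero terms are omitted.
    """
    return {
        (i, j): a * b
        for i, a in enumerate(v)
        for j, b in enumerate(w)
        if a * b != 0
    }


v = [2, -5, 0, 3]  # 2 - 5x + 3x^3
w = [0, 7, 4]      # 7y + 4y^2
product = tensor_poly(v, w)
for (i, j), c in sorted(product.items()):
    print(f"{c:+d} x^{i} y^{j}")
```

The six nonzero coefficients printed correspond term by term to the expansion of $v \otimes w$ given in the example.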


EXAMPLE A.2 

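Let V be the vector space of m x n matrices over a field K and let W be the vector space of p x q matrices
over K. Suppose $A = [a_{ij}]$ belongs to V, and B belongs to W. Let T be the vector space of mp x nq
matrices over K. Then T is the tensor product of V and W where $A \otimes B$ is the block matrix
\[ A \otimes B = [a_{ij}B] = \begin{bmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ a_{21}B & a_{22}B & \cdots & a_{2n}B \\ \vdots & \vdots & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{bmatrix} \]
For example, if $A = [1, 2;\ 3, 4]$ and $B = [1, 2, 3;\ 4, 5, 6]$, then
\[ A \otimes B = \begin{bmatrix} 1 & 2 & 3 & 2 & 4 & 6 \\ 4 & 5 & 6 & 8 & 10 & 12 \\ 3 & 6 & 9 & 4 & 8 & 12 \\ 12 & 15 & 18 & 16 & 20 & 24 \end{bmatrix} \]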
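The block-matrix rule of Example A.2 (often called the Kronecker product) is easy to compute directly. Below is a small sketch in plain Python, assuming matrices are stored as lists of rows; the function name `kron` is a hypothetical helper, not notation from the text. The sample inputs are the 2 x 2 and 2 x 3 matrices whose block product is the 4 x 6 array shown above.

```python
def kron(A, B):
    """Block matrix [a_ij * B] for A (m x n) and B (p x q), as a list of rows."""
    p, q = len(B), len(B[0])
    rows, cols = len(A) * p, len(A[0]) * q
    # Entry (i, j) lies in block (i // p, j // q) at position (i % p, j % q).
    return [
        [A[i // p][j // q] * B[i % p][j % q] for j in range(cols)]
        for i in range(rows)
    ]


A = [[1, 2], [3, 4]]
B = [[1, 2, 3], [4, 5, 6]]
for row in kron(A, B):
    print(row)
```

Note that the result has mp rows and nq columns, in agreement with $\dim T = (\dim V)(\dim W)$.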


Isomorphisms of Tensor Products 


First we note that tensoring is associative in a canonical way. Namely,


THEOREM A.3: Let U, V, W be vector spaces over a field K. Then there exists a unique isomorphism
\[ (U \otimes V) \otimes W \to U \otimes (V \otimes W) \]
such that, for every $u \in U$, $v \in V$, $w \in W$,
\[ (u \otimes v) \otimes w \mapsto u \otimes (v \otimes w) \]


Accordingly, we may omit parentheses when tensoring any number of factors. Specifically, given
vector spaces $V_1, V_2, \ldots, V_m$ over a field K, we may unambiguously form their tensor product
\[ V_1 \otimes V_2 \otimes \cdots \otimes V_m \]
and, for vectors $v_j$ in $V_j$, we may unambiguously form the tensor product
\[ v_1 \otimes v_2 \otimes \cdots \otimes v_m \]

Moreover, given a vector space V over K, we may unambiguously define the following tensor
product:
\[ \otimes^r V = V \otimes V \otimes \cdots \otimes V \quad (r \text{ factors}) \]

Also, there is a canonical isomorphism
\[ (\otimes^r V) \otimes (\otimes^s V) \to \otimes^{r+s} V \]

Furthermore, viewing K as a vector space over itself, we have the canonical isomorphism
\[ K \otimes V \to V \]
where we define $a \otimes v = av$.

A.3 Alternating Multilinear Maps


Let $f: V^r \to U$ where V and U are vector spaces over K. [Recall $V^r = V \times V \times \cdots \times V$, r factors.]


(1) The mapping f is said to be multilinear or r-linear if $f(v_1, \ldots, v_r)$ is linear as a function of each $v_j$
when the other $v_i$'s are held fixed. That is,
\[ f(\ldots, v_j + v'_j, \ldots) = f(\ldots, v_j, \ldots) + f(\ldots, v'_j, \ldots) \]
\[ f(\ldots, kv_j, \ldots) = k\, f(\ldots, v_j, \ldots) \]
where only the jth position changes.
(2) The mapping f is said to be alternating if
\[ f(v_1, \ldots, v_r) = 0 \quad\text{whenever } v_i = v_j \text{ with } i \ne j \]

One can easily show (Prove!) that if f is an alternating multilinear mapping on $V^r$, then
\[ f(\ldots, v_i, \ldots, v_j, \ldots) = -f(\ldots, v_j, \ldots, v_i, \ldots) \]

That is, if two of the vectors are interchanged, then the associated value changes sign.


EXAMPLE A.3 (Determinants) 
The determinant function $D: M \to K$ on the space M of n x n matrices may be viewed as an n-variable function
\[ D(A) = D(R_1, R_2, \ldots, R_n) \]
defined on the rows $R_1, R_2, \ldots, R_n$ of A. Recall (Chapter 8) that, in this context, D is both n-linear and alternating.


We now need some additional notation. Let $K = [k_1, k_2, \ldots, k_r]$ denote an r-list (r-tuple) of elements
from $I_n = (1, 2, \ldots, n)$. We will then use the following notation where the $v_k$'s denote vectors and the
$a_{ik}$'s denote scalars:
\[ v_K = (v_{k_1}, v_{k_2}, \ldots, v_{k_r}) \quad\text{and}\quad a_K = a_{1k_1} a_{2k_2} \cdots a_{rk_r} \]
Note $v_K$ is a list of r vectors, and $a_K$ is a product of r scalars.

Now suppose the elements in $K = [k_1, k_2, \ldots, k_r]$ are distinct. Then K is a permutation $\sigma_K$ of an r-list
$J = [i_1, i_2, \ldots, i_r]$ in standard form, that is, where $i_1 < i_2 < \cdots < i_r$. The number of such standard-form
r-lists J from $I_n$ is the binomial coefficient:
\[ \binom{n}{r} = \frac{n!}{r!\,(n-r)!} \]
[Recall $\mathrm{sign}(\sigma_K) = (-1)^{m_K}$ where $m_K$ is the number of interchanges that transforms K into J.]
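The notation above can be made concrete with a short sketch. It assumes r-lists are represented as Python tuples; the helper names `standard_lists` and `sign_to_standard` are hypothetical. The sign is computed by counting inversions, which has the same parity as any sequence of interchanges that sorts the list.

```python
from itertools import combinations
from math import comb


def standard_lists(n, r):
    """All standard-form r-lists i_1 < i_2 < ... < i_r drawn from 1..n."""
    return list(combinations(range(1, n + 1), r))


def sign_to_standard(K):
    """Sign (-1)^m of the permutation that sorts the distinct list K.

    m is taken as the number of inversions, whose parity equals the parity
    of any number of interchanges transforming K into standard form.
    """
    inversions = sum(
        1
        for i in range(len(K))
        for j in range(i + 1, len(K))
        if K[i] > K[j]
    )
    return -1 if inversions % 2 else 1


print(standard_lists(3, 2))
print(len(standard_lists(5, 3)), comb(5, 3))
print(sign_to_standard((2, 1)), sign_to_standard((3, 1, 2)))
```

The count of standard-form lists agrees with the binomial coefficient, as stated in the text.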


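Now suppose $A = [a_{ij}]$ is an r x n matrix. For a given ordered r-list J, we define
\[ D_J(A) = \begin{vmatrix} a_{1i_1} & a_{1i_2} & \cdots & a_{1i_r} \\ a_{2i_1} & a_{2i_2} & \cdots & a_{2i_r} \\ \vdots & \vdots & & \vdots \\ a_{ri_1} & a_{ri_2} & \cdots & a_{ri_r} \end{vmatrix} \]
That is, $D_J(A)$ is the determinant of the r x r submatrix of A whose column subscripts belong to J.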
Our main theorem below uses the following ‘‘shuffling’’ lemma. 


LEMMA A.4 Let V and U be vector spaces over K, and let $f: V^r \to U$ be an alternating r-linear
mapping. Let $v_1, v_2, \ldots, v_n$ be vectors in V and let $A = [a_{ij}]$ be an r x n matrix over K
where $r \le n$. For $i = 1, 2, \ldots, r$, let
\[ u_i = a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n \]
Then
\[ f(u_1, \ldots, u_r) = \sum_J D_J(A)\, f(v_{i_1}, v_{i_2}, \ldots, v_{i_r}) \]
where the sum is over all standard-form r-lists $J = \{i_1, i_2, \ldots, i_r\}$.


The proof is technical but straightforward. The linearity of f gives us the sum
\[ f(u_1, \ldots, u_r) = \sum_K a_K\, f(v_K) \]
where the sum is over all r-lists K from $\{1, \ldots, n\}$. The alternating property of f tells us that $f(v_K) = 0$
when K does not contain distinct integers. The proof now mainly uses the fact that as we interchange the
$v_j$'s to transform
\[ f(v_K) = f(v_{k_1}, v_{k_2}, \ldots, v_{k_r}) \quad\text{to}\quad f(v_J) = f(v_{i_1}, v_{i_2}, \ldots, v_{i_r}) \]
so that $i_1 < \cdots < i_r$, the associated sign of $a_K$ will change in the same way as the sign of the
corresponding permutation $\sigma_K$ changes when it is transformed to the identity permutation using
transpositions.

We illustrate the lemma below for r = 2 and n = 3. 


EXAMPLE A.4 Suppose $f: V^2 \to U$ is an alternating multilinear function. Let $v_1, v_2, v_3 \in V$ and let $u, w \in V$.
Suppose
\[ u = a_1v_1 + a_2v_2 + a_3v_3 \quad\text{and}\quad w = b_1v_1 + b_2v_2 + b_3v_3 \]
Consider
\[ f(u, w) = f(a_1v_1 + a_2v_2 + a_3v_3,\ b_1v_1 + b_2v_2 + b_3v_3) \]
Using multilinearity, we get nine terms:
\[ \begin{aligned} f(u, w) ={} & a_1b_1 f(v_1, v_1) + a_1b_2 f(v_1, v_2) + a_1b_3 f(v_1, v_3) \\ & + a_2b_1 f(v_2, v_1) + a_2b_2 f(v_2, v_2) + a_2b_3 f(v_2, v_3) \\ & + a_3b_1 f(v_3, v_1) + a_3b_2 f(v_3, v_2) + a_3b_3 f(v_3, v_3) \end{aligned} \]

(Note that $J = [1, 2]$, $J' = [1, 3]$, and $J'' = [2, 3]$ are the three standard-form 2-lists of $I = [1, 2, 3]$.) The
alternating property of f tells us that each $f(v_i, v_i) = 0$; hence, three of the above nine terms are equal to
0. The alternating property also tells us that $f(v_i, v_j) = -f(v_j, v_i)$. Thus, three of the terms can be
transformed so their subscripts form a standard-form 2-list by a single interchange. Finally we obtain
\[ \begin{aligned} f(u, w) &= (a_1b_2 - a_2b_1) f(v_1, v_2) + (a_1b_3 - a_3b_1) f(v_1, v_3) + (a_2b_3 - a_3b_2) f(v_2, v_3) \\ &= \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} f(v_1, v_2) + \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} f(v_1, v_3) + \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} f(v_2, v_3) \end{aligned} \]
which is the content of Lemma A.4.
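The identity of Example A.4 can be spot-checked numerically. The sketch below makes several assumptions for illustration only: V is taken to be $R^2$, the alternating bilinear map f is the 2 x 2 determinant, and the three vectors and the coefficients are arbitrary sample values. It compares f(u, w) computed directly with the sum of 2 x 2 minors times $f(v_i, v_j)$ over standard-form 2-lists.

```python
from itertools import combinations


def f(u, w):
    """A sample alternating bilinear map on R^2: the 2 x 2 determinant."""
    return u[0] * w[1] - u[1] * w[0]


def combo(coeffs, vectors):
    """Linear combination coeffs[0]*vectors[0] + ... in R^2."""
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(2))


def lemma_rhs(a, b, vectors):
    """Sum over standard-form 2-lists (i < j) of the minor times f(v_i, v_j)."""
    total = 0
    for i, j in combinations(range(len(vectors)), 2):
        minor = a[i] * b[j] - a[j] * b[i]
        total += minor * f(vectors[i], vectors[j])
    return total


vs = [(1, 0), (0, 1), (2, 5)]  # sample v_1, v_2, v_3 (assumed values)
a, b = (1, 2, 3), (4, 5, 6)    # sample coefficients of u and w
lhs = f(combo(a, vs), combo(b, vs))
rhs = lemma_rhs(a, b, vs)
print(lhs, rhs)
```

Both sides agree for these values; a single numerical check is of course not a proof, only a sanity test of the formula.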


A.4 Exterior Products 


The following definition applies. 


DEFINITION A.2: Let V be an n-dimensional vector space over a field K, and let r be an integer such
that $1 \le r \le n$. The r-fold exterior product (or simply exterior product when r is
understood) is a vector space E over K together with an alternating r-linear mapping
$g: V^r \to E$, denoted by $g(v_1, \ldots, v_r) = v_1 \wedge \cdots \wedge v_r$, with the following property:
(*) For any vector space U over K and any alternating r-linear map $f: V^r \to U$
there exists a unique linear map $f^*: E \to U$ such that $f^* \circ g = f$.

The r-fold exterior product (E, g) (or simply E when g is understood) of V is denoted by $\wedge^r V$, and the
element $v_1 \wedge \cdots \wedge v_r$ is called the exterior product or wedge product of the $v_k$'s.

Another way to state condition (*) is that the diagram in Fig. A-1(c) commutes. Again, the fact that
such a unique linear map $f^*$ exists is called the "Universal Mapping Principle" (UMP). As illustrated in
Fig. A-1(c), condition (*) also says that any alternating r-linear map $f: V^r \to U$ "factors through" the
exterior product $E = \wedge^r V$. Again, the uniqueness in (*) implies that the image of g spans E; that is,
$\mathrm{span}(\{v_1 \wedge \cdots \wedge v_r\}) = E$.


THEOREM A.5: (Uniqueness of Exterior Products) Let $(E, g)$ and $(E', g')$ be r-fold exterior products
of V. Then there exists a unique isomorphism $h: E \to E'$ such that $hg = g'$.

The proof is the same as the proof of Theorem A.1, which uses the UMP.

THEOREM A.6: (Existence of Exterior Products) Let V be an n-dimensional vector space over K.
Then the exterior product $E = \wedge^r V$ exists. If $r > n$, then $E = \{0\}$. If $r \le n$, then
$\dim E = \binom{n}{r}$. Moreover, if $[v_1, \ldots, v_n]$ is a basis of V, then the vectors
\[ v_{i_1} \wedge v_{i_2} \wedge \cdots \wedge v_{i_r} \]
where $1 \le i_1 < i_2 < \cdots < i_r \le n$, form a basis of E.

We give a concrete example of an exterior product.


EXAMPLE A.5 (Cross Product) 
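Consider $V = R^3$ with the usual basis (i, j, k). Let $E = \wedge^2 V$. Note $\dim V = 3$. Thus, $\dim E = 3$ with basis
$i \wedge j$, $i \wedge k$, $j \wedge k$. We identify E with $R^3$ under the correspondence
\[ i = j \wedge k, \quad j = k \wedge i = -i \wedge k, \quad k = i \wedge j \]
Let u and w be arbitrary vectors in $V = R^3$, say
\[ u = (a_1, a_2, a_3) = a_1i + a_2j + a_3k \quad\text{and}\quad w = (b_1, b_2, b_3) = b_1i + b_2j + b_3k \]
Then, as in Example A.4,
\[ u \wedge w = (a_1b_2 - a_2b_1)(i \wedge j) + (a_1b_3 - a_3b_1)(i \wedge k) + (a_2b_3 - a_3b_2)(j \wedge k) \]
Using the above identification, we get
\[ \begin{aligned} u \wedge w &= (a_2b_3 - a_3b_2)\,i - (a_1b_3 - a_3b_1)\,j + (a_1b_2 - a_2b_1)\,k \\ &= \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} i - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} j + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} k \end{aligned} \]

The reader may recognize that the above exterior product is precisely the well-known cross product
in $R^3$.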
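The identification in Example A.5 can be traced step by step in code. This is a minimal sketch with hypothetical helper names (`wedge2`, `to_R3`, `cross`): it computes the three coefficients of $u \wedge w$ on the basis $i \wedge j$, $i \wedge k$, $j \wedge k$, applies the correspondence $i = j \wedge k$, $j = -i \wedge k$, $k = i \wedge j$, and compares the result with the usual cross product formula.

```python
def wedge2(u, w):
    """Coefficients of u ^ w on the basis (i^j, i^k, j^k) of the 2-fold exterior product of R^3."""
    a1, a2, a3 = u
    b1, b2, b3 = w
    return (a1 * b2 - a2 * b1, a1 * b3 - a3 * b1, a2 * b3 - a3 * b2)


def to_R3(coeffs):
    """Apply the identification i = j^k, j = -(i^k), k = i^j."""
    c_ij, c_ik, c_jk = coeffs
    return (c_jk, -c_ik, c_ij)


def cross(u, w):
    """Standard cross product in R^3."""
    a1, a2, a3 = u
    b1, b2, b3 = w
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


u, w = (1, 2, 3), (4, 5, 6)
print(to_R3(wedge2(u, w)), cross(u, w))
```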

Our last theorem tells us that we are actually able to ‘‘multiply’’ exterior products, which allows us to 
form an ‘‘exterior algebra’’ that is illustrated below. 


THEOREM A.7: Let V be a vector space over K. Let r and s be positive integers. Then there is a
unique bilinear mapping
\[ \wedge^r V \times \wedge^s V \to \wedge^{r+s} V \]
such that, for any vectors $u_i$, $w_j$ in V,
\[ (u_1 \wedge \cdots \wedge u_r) \times (w_1 \wedge \cdots \wedge w_s) \mapsto u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s \]

EXAMPLE A.6
We form an exterior algebra A over a field K using noncommuting variables x, y, z. Because it is an exterior algebra,
our variables satisfy:
\[ x \wedge x = 0, \quad y \wedge y = 0, \quad z \wedge z = 0, \quad\text{and}\quad y \wedge x = -x \wedge y, \quad z \wedge x = -x \wedge z, \quad z \wedge y = -y \wedge z \]
Every element of A is a linear combination of the eight elements
\[ 1, \quad x, \quad y, \quad z, \quad x \wedge y, \quad x \wedge z, \quad y \wedge z, \quad x \wedge y \wedge z \]

We multiply two "polynomials" in A using the usual distributive law, but now we also use the above conditions. For
example,
\[ [3 + 4y - 5x \wedge y + 6x \wedge z] \wedge [5x - 2y] = 15x - 6y - 20x \wedge y + 12x \wedge y \wedge z \]
Observe we use the fact that
\[ [4y] \wedge [5x] = 20y \wedge x = -20x \wedge y \quad\text{and}\quad [6x \wedge z] \wedge [-2y] = -12x \wedge z \wedge y = 12x \wedge y \wedge z \]
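The multiplication rule of Example A.6 can be sketched in a few lines. The representation below is an assumption made for illustration: an element of A is a dictionary mapping a basis monomial (a tuple of variable names in increasing order, with the empty tuple standing for 1) to its coefficient. The helper names `wedge_monomials` and `wedge` are hypothetical. Sorting by adjacent swaps and flipping the sign at each swap implements the rule that interchanging two variables changes the sign.

```python
def wedge_monomials(m1, m2):
    """Wedge of two basis monomials (tuples of variable names in increasing order).

    Returns (sign, monomial), or (0, None) when a variable repeats.
    """
    if set(m1) & set(m2):
        return 0, None
    seq = list(m1 + m2)
    sign = 1
    # Bubble sort: every adjacent interchange flips the sign.
    for i in range(len(seq)):
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)


def wedge(p, q):
    """Product of two elements given as {monomial: coefficient} dictionaries."""
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            sign, m = wedge_monomials(m1, m2)
            if sign:
                out[m] = out.get(m, 0) + sign * c1 * c2
    return {m: c for m, c in out.items() if c != 0}


p = {(): 3, ("y",): 4, ("x", "y"): -5, ("x", "z"): 6}  # 3 + 4y - 5x^y + 6x^z
q = {("x",): 5, ("y",): -2}                            # 5x - 2y
print(wedge(p, q))
```

The four surviving terms match the product computed in the example; all other products vanish because a variable is repeated.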


APPENDIX B 


Algebraic Structures 


B.1 Introduction 


We define here algebraic structures that occur in almost all branches of mathematics. In particular, we 
will define a field that appears in the definition of a vector space. We begin with the definition of a group, 
which is a relatively simple algebraic structure with only one operation and is used as a building block for 
many other algebraic systems. 


B.2 Groups 


Let G be a nonempty set with a binary operation; that is, to each pair of elements $a, b \in G$ there is
assigned an element $ab \in G$. Then G is called a group if the following axioms hold:


$[G_1]$ For any $a, b, c \in G$, we have $(ab)c = a(bc)$ (the associative law).


$[G_2]$ There exists an element $e \in G$, called the identity element, such that $ae = ea = a$ for every
$a \in G$.


$[G_3]$ For each $a \in G$ there exists an element $a^{-1} \in G$, called the inverse of a, such that
$aa^{-1} = a^{-1}a = e$.

A group G is said to be abelian (or commutative) if the commutative law holds; that is, if $ab = ba$ for
every $a, b \in G$.

When the binary operation is denoted by juxtaposition as above, the group G is said to be written 
multiplicatively. Sometimes, when G is abelian, the binary operation is denoted by + and G is said to be 
written additively. In such a case, the identity element is denoted by 0 and is called the zero element; the 
inverse is denoted by —a and it is called the negative of a. 

If A and B are subsets of a group G, then we write 


\[ AB = \{ab \mid a \in A,\ b \in B\} \quad\text{or}\quad A + B = \{a + b \mid a \in A,\ b \in B\} \]


We also write a for {a}. 

A subset H of a group G is called a subgroup of G if H forms a group under the operation of G. If H is 
a subgroup of G and a € G, then the set Ha is called a right coset of H and the set aH is called a left coset 
of H. 


DEFINITION: A subgroup H of G is called a normal subgroup if $a^{-1}Ha \subseteq H$ for every $a \in G$.
Equivalently, H is normal if $aH = Ha$ for every $a \in G$; that is, if the right and left
cosets of H coincide.


Note that every subgroup of an abelian group is normal. 


THEOREM B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group under 
coset multiplication. This group is called the quotient group and is denoted by G/H. 


EXAMPLE B.1 The set Z of integers forms an abelian group under addition. (We remark that the even integers
form a subgroup of Z but the odd integers do not.) Let H denote the set of multiples of 5; that is,
$H = \{\ldots, -10, -5, 0, 5, 10, \ldots\}$. Then H is a subgroup (necessarily normal) of Z. The cosets of H in Z follow:

\[ \begin{aligned} \bar{0} &= 0 + H = H = \{\ldots, -10, -5, 0, 5, 10, \ldots\} \\ \bar{1} &= 1 + H = \{\ldots, -9, -4, 1, 6, 11, \ldots\} \\ \bar{2} &= 2 + H = \{\ldots, -8, -3, 2, 7, 12, \ldots\} \\ \bar{3} &= 3 + H = \{\ldots, -7, -2, 3, 8, 13, \ldots\} \\ \bar{4} &= 4 + H = \{\ldots, -6, -1, 4, 9, 14, \ldots\} \end{aligned} \]

For any other integer $n \in Z$, $\bar{n} = n + H$ coincides with one of the above cosets. Thus, by the above theorem,
$Z/H = \{\bar{0}, \bar{1}, \bar{2}, \bar{3}, \bar{4}\}$ forms a group under coset addition; its addition table follows:

\[ \begin{array}{c|ccccc} + & \bar{0} & \bar{1} & \bar{2} & \bar{3} & \bar{4} \\ \hline \bar{0} & \bar{0} & \bar{1} & \bar{2} & \bar{3} & \bar{4} \\ \bar{1} & \bar{1} & \bar{2} & \bar{3} & \bar{4} & \bar{0} \\ \bar{2} & \bar{2} & \bar{3} & \bar{4} & \bar{0} & \bar{1} \\ \bar{3} & \bar{3} & \bar{4} & \bar{0} & \bar{1} & \bar{2} \\ \bar{4} & \bar{4} & \bar{0} & \bar{1} & \bar{2} & \bar{3} \end{array} \]

This quotient group Z/H is referred to as the integers modulo 5 and is frequently denoted by $Z_5$. Analogously, for
any positive integer n, there exists the quotient group $Z_n$ called the integers modulo n.
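The cosets and the addition table of Example B.1 can be generated directly. The following sketch assumes a coset $a + H$ is displayed through its members in a finite window of integers, and represents each coset by its residue 0, ..., n-1; the function names `coset` and `addition_table` are hypothetical.

```python
def coset(a, n, lo=-10, hi=14):
    """Members of the coset a + H in the window [lo, hi], where H = multiples of n."""
    return [k for k in range(lo, hi + 1) if (k - a) % n == 0]


def addition_table(n):
    """Addition table of Z/H, with the coset of k represented by k in 0..n-1."""
    return [[(a + b) % n for b in range(n)] for a in range(n)]


for a in range(5):
    print(a, coset(a, 5))
for row in addition_table(5):
    print(row)
```

Coset addition is well defined here because $(a + H) + (b + H) = (a + b) + H$, which is what reducing a + b modulo n expresses.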


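EXAMPLE B.2 The permutations of n symbols (see page 267) form a group under composition of mappings; it is
called the symmetric group of degree n and is denoted by $S_n$. We investigate $S_3$ here; its elements are

\[ \varepsilon = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad \sigma_1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}, \]
\[ \sigma_3 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \quad \phi_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad \phi_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} \]

Here $\begin{pmatrix} 1 & 2 & 3 \\ i & j & k \end{pmatrix}$ is the permutation that maps $1 \mapsto i$, $2 \mapsto j$, $3 \mapsto k$. The multiplication table of $S_3$ is

\[ \begin{array}{c|cccccc} & \varepsilon & \sigma_1 & \sigma_2 & \sigma_3 & \phi_1 & \phi_2 \\ \hline \varepsilon & \varepsilon & \sigma_1 & \sigma_2 & \sigma_3 & \phi_1 & \phi_2 \\ \sigma_1 & \sigma_1 & \varepsilon & \phi_1 & \phi_2 & \sigma_2 & \sigma_3 \\ \sigma_2 & \sigma_2 & \phi_2 & \varepsilon & \phi_1 & \sigma_3 & \sigma_1 \\ \sigma_3 & \sigma_3 & \phi_1 & \phi_2 & \varepsilon & \sigma_1 & \sigma_2 \\ \phi_1 & \phi_1 & \sigma_3 & \sigma_1 & \sigma_2 & \phi_2 & \varepsilon \\ \phi_2 & \phi_2 & \sigma_2 & \sigma_3 & \sigma_1 & \varepsilon & \phi_1 \end{array} \]

(The element in the ath row and bth column is ab.) The set $H = \{\varepsilon, \sigma_1\}$ is a subgroup of $S_3$; its right and left
cosets are

Right Cosets                                Left Cosets
$H = \{\varepsilon, \sigma_1\}$             $H = \{\varepsilon, \sigma_1\}$
$H\phi_1 = \{\phi_1, \sigma_2\}$            $\phi_1H = \{\phi_1, \sigma_3\}$
$H\phi_2 = \{\phi_2, \sigma_3\}$            $\phi_2H = \{\phi_2, \sigma_2\}$

Observe that the right cosets and the left cosets are distinct; hence, H is not a normal subgroup of $S_3$.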
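The coset computation in Example B.2 can be reproduced with a few lines of code. This sketch rests on two stated assumptions: a permutation is stored as the tuple of images of 1, 2, 3, and the product ab means "apply b first, then a". The labels `e`, `s1`, `s2`, `s3`, `f1`, `f2` are hypothetical names for the six elements, chosen here so that `s1` interchanges 2 and 3 and `f1`, `f2` are the two 3-cycles.

```python
def compose(a, b):
    """Product ab: apply b first, then a (permutations as tuples of images of 1..n)."""
    return tuple(a[b[i] - 1] for i in range(len(a)))


# Assumed labeling of the six elements of S_3 (tuples of images of 1, 2, 3).
e, s1, s2, s3 = (1, 2, 3), (1, 3, 2), (3, 2, 1), (2, 1, 3)
f1, f2 = (2, 3, 1), (3, 1, 2)

H = {e, s1}


def right_coset(g):
    return {compose(h, g) for h in H}


def left_coset(g):
    return {compose(g, h) for h in H}


print(right_coset(f1), left_coset(f1))
print(right_coset(f2), left_coset(f2))
print("H is normal:", all(right_coset(g) == left_coset(g) for g in (e, s1, s2, s3, f1, f2)))
```

Under the opposite composition convention the roles of right and left cosets are exchanged, but the conclusion that H is not normal is unaffected.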


A mapping f from a group G into a group $G'$ is called a homomorphism if $f(ab) = f(a)f(b)$ for every
$a, b \in G$. (If f is also bijective, i.e., one-to-one and onto, then f is called an isomorphism and G and $G'$ are
said to be isomorphic.) If $f: G \to G'$ is a homomorphism, then the kernel of f is the set of elements of G
that map into the identity element $e' \in G'$:
\[ \text{kernel of } f = \{a \in G \mid f(a) = e'\} \]
(As usual, $f(G)$ is called the image of the mapping $f: G \to G'$.) The following theorem applies.


THEOREM B.2: Let $f: G \to G'$ be a homomorphism with kernel K. Then K is a normal subgroup of G,
and the quotient group G/K is isomorphic to the image of f.


EXAMPLE B.3 Let G be the group of real numbers under addition, and let $G'$ be the group of positive real numbers
under multiplication. The mapping $f: G \to G'$ defined by $f(a) = 2^a$ is a homomorphism because
\[ f(a + b) = 2^{a+b} = 2^a 2^b = f(a)f(b) \]
In fact, f is also bijective; hence, G and $G'$ are isomorphic.


EXAMPLE B.4 Let G be the group of nonzero complex numbers under multiplication, and let $G'$ be the group of
nonzero real numbers under multiplication. The mapping $f: G \to G'$ defined by $f(z) = |z|$ is a homomorphism
because
\[ f(z_1z_2) = |z_1z_2| = |z_1||z_2| = f(z_1)f(z_2) \]
The kernel K of f consists of those complex numbers z on the unit circle; that is, those for which $|z| = 1$. Thus, G/K is
isomorphic to the image of f, that is, to the group of positive real numbers under multiplication.


B.3 Rings, Integral Domains, and Fields 


Let R be a nonempty set with two binary operations, an operation of addition (denoted by +) and an 
operation of multiplication (denoted by juxtaposition). Then R is called a ring if the following axioms are 
satisfied: 


$[R_1]$ For any $a, b, c \in R$, we have $(a + b) + c = a + (b + c)$.


$[R_2]$ There exists an element $0 \in R$, called the zero element, such that $a + 0 = 0 + a = a$ for every
$a \in R$.


$[R_3]$ For each $a \in R$ there exists an element $-a \in R$, called the negative of a, such that
$a + (-a) = (-a) + a = 0$.


$[R_4]$ For any $a, b \in R$, we have $a + b = b + a$.


$[R_5]$ For any $a, b, c \in R$, we have $(ab)c = a(bc)$.


$[R_6]$ For any $a, b, c \in R$, we have
(i) $a(b + c) = ab + ac$, and (ii) $(b + c)a = ba + ca$.


Observe that the axioms $[R_1]$ through $[R_4]$ may be summarized by saying that R is an abelian group
under addition.

Subtraction is defined in R by $a - b = a + (-b)$.

It can be shown (see Problem B.25) that $a \cdot 0 = 0 \cdot a = 0$ for every $a \in R$.

R is called a commutative ring if $ab = ba$ for every $a, b \in R$. We also say that R is a ring with a unit
element if there exists a nonzero element $1 \in R$ such that $a \cdot 1 = 1 \cdot a = a$ for every $a \in R$.

A nonempty subset S of R is called a subring of R if S forms a ring under the operations of R. We note
that S is a subring of R if and only if $a, b \in S$ implies $a - b \in S$ and $ab \in S$.

A nonempty subset I of R is called a left ideal in R if (i) $a - b \in I$ whenever $a, b \in I$, and (ii) $ra \in I$
whenever $r \in R$, $a \in I$. Note that a left ideal I in R is also a subring of R. Similarly, we can define a right
ideal and a two-sided ideal. Clearly all ideals in commutative rings are two sided. The term ideal shall
mean two-sided ideal unless otherwise specified.

THEOREM B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets $\{a + I \mid a \in R\}$ form a ring
under coset addition and coset multiplication. This ring is denoted by R/I and is
called the quotient ring.


Now let R be a commutative ring with a unit element. For any $a \in R$, the set $(a) = \{ra \mid r \in R\}$ is an
ideal; it is called the principal ideal generated by a. If every ideal in R is a principal ideal, then R is called
a principal ideal ring.


DEFINITION: A commutative ring R with a unit element is called an integral domain if R has no
zero divisors; that is, if $ab = 0$ implies $a = 0$ or $b = 0$.


DEFINITION: A commutative ring R with a unit element is called a field if every nonzero $a \in R$ has a
multiplicative inverse; that is, there exists an element $a^{-1} \in R$ such that $aa^{-1} = a^{-1}a = 1$.
A field is necessarily an integral domain; for if $ab = 0$ and $a \ne 0$, then
\[ b = 1 \cdot b = a^{-1}ab = a^{-1} \cdot 0 = 0 \]


We remark that a field may also be viewed as a commutative ring in which the nonzero elements form a 
group under multiplication. 


EXAMPLE B.5 The set Z of integers with the usual operations of addition and multiplication is the classical
example of an integral domain with a unit element. Every ideal I in Z is a principal ideal; that is, $I = (n)$ for
some integer n. The quotient ring $Z_n = Z/(n)$ is called the ring of integers modulo n. If n is prime, then $Z_n$ is a field.
On the other hand, if n is not prime then $Z_n$ has zero divisors. For example, in the ring $Z_6$, $2 \cdot 3 = 0$ although
$2 \ne 0$ and $3 \ne 0$.
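The contrast between prime and composite moduli in Example B.5 can be observed by brute force. The sketch below represents the elements of $Z_n$ by the residues 0, ..., n-1; the function names `zero_divisors` and `is_field` are hypothetical helpers, and the exhaustive search is only sensible for small n.

```python
def zero_divisors(n):
    """Nonzero residues a in Z_n with a*b = 0 (mod n) for some nonzero residue b."""
    return sorted(
        {a for a in range(1, n) for b in range(1, n) if (a * b) % n == 0}
    )


def is_field(n):
    """True if every nonzero residue modulo n has a multiplicative inverse."""
    return n > 1 and all(
        any((a * b) % n == 1 for b in range(1, n)) for a in range(1, n)
    )


print(zero_divisors(6), is_field(6))
print(zero_divisors(5), is_field(5))
```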


EXAMPLE B.6 The rational numbers Q and the real numbers R each form a field with respect to the usual 
operations of addition and multiplication. 


EXAMPLE B.7 Let C denote the set of ordered pairs of real numbers with addition and multiplication defined by 
(a, b)+ (c, d) = (a+c, b+d) 
(a, b) - (c, d) = (ac — bd, ad + bc) 


Then C satisfies all the required properties of a field. In fact, C is just the field of complex numbers (see page 4). 


EXAMPLE B.8 The set M of all 2 x 2 matrices with real entries forms a noncommutative ring with zero divisors 
under the operations of matrix addition and matrix multiplication. 


EXAMPLE B.9 Let R be any ring. Then the set R[x] of all polynomials over R forms a ring with respect to the usual 
operations of addition and multiplication of polynomials. Moreover, if R is an integral domain then R[x] is also an 
integral domain. 


Now let D be an integral domain. We say that b divides a in D if $a = bc$ for some $c \in D$. An element
$u \in D$ is called a unit if u divides 1; that is, if u has a multiplicative inverse. An element $b \in D$ is called
an associate of $a \in D$ if $b = ua$ for some unit $u \in D$. A nonunit $p \in D$ is said to be irreducible if $p = ab$
implies a or b is a unit.

An integral domain D is called a unique factorization domain if every nonunit $a \in D$ can be written
uniquely (up to associates and order) as a product of irreducible elements.


EXAMPLE B.10 The ring Z of integers is the classical example of a unique factorization domain. The units of Z
are 1 and $-1$. The only associates of $n \in Z$ are n and $-n$. The irreducible elements of Z are the prime numbers.


EXAMPLE B.11 The set $D = \{a + b\sqrt{13} \mid a, b \text{ integers}\}$ is an integral domain. The units of D include $\pm 1$,
$18 \pm 5\sqrt{13}$, and $-18 \pm 5\sqrt{13}$. The elements 2, $3 - \sqrt{13}$, and $-3 - \sqrt{13}$ are irreducible in D. Observe that
$4 = 2 \cdot 2 = (3 - \sqrt{13})(-3 - \sqrt{13})$. Thus, D is not a unique factorization domain. (See Problem B.40.)
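The two factorizations of 4 in Example B.11 can be verified with exact integer arithmetic. In this sketch the element $a + b\sqrt{13}$ is stored as the pair (a, b); the helper names `mul` and `norm` are hypothetical, and the norm $a^2 - 13b^2$ is a standard tool (not introduced in the text above) whose value $\pm 1$ characterizes units.

```python
def mul(x, y):
    """(a + b*sqrt(13)) * (c + d*sqrt(13)), with elements stored as pairs (a, b)."""
    a, b = x
    c, d = y
    return (a * c + 13 * b * d, a * d + b * c)


def norm(x):
    """Norm a^2 - 13 b^2; an element of D is a unit exactly when its norm is 1 or -1."""
    a, b = x
    return a * a - 13 * b * b


print(mul((2, 0), (2, 0)))       # 2 * 2
print(mul((3, -1), (-3, -1)))    # (3 - sqrt(13)) * (-3 - sqrt(13))
print(norm((18, 5)), mul((18, 5), (-18, 5)))
```

The last line shows that $18 + 5\sqrt{13}$ has norm $-1$ and that $-18 + 5\sqrt{13}$ is its inverse.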
B.4 Modules


Let M be an additive abelian group and let R be a ring with a unit element. Then M is said to be a (left) R-module
if there exists a mapping $R \times M \to M$ that satisfies the following axioms:

$[M_1]$ $r(m_1 + m_2) = rm_1 + rm_2$

$[M_2]$ $(r + s)m = rm + sm$

$[M_3]$ $(rs)m = r(sm)$

$[M_4]$ $1 \cdot m = m$

for any $r, s \in R$ and any $m, m_i \in M$.
We emphasize that an R-module is a generalization of a vector space where we allow the scalars to 
come from a ring rather than a field. 


EXAMPLE B.12 Let G be any additive abelian group. We make G into a module over the ring Z of integers by
defining
\[ ng = \underbrace{g + g + \cdots + g}_{n \text{ times}}, \quad 0g = 0, \quad (-n)g = -ng \]
where n is any positive integer.

EXAMPLE B.13 Let R be a ring and let I be an ideal in R. Then I may be viewed as a module over R.


EXAMPLE B.14 Let V be a vector space over a field K and let T: V — V be a linear mapping. We make V into a 
module over the ring K[x] of polynomials over K by defining f(x)v = f(T)(v). The reader should check that a scalar 
multiplication has been defined. 
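The scalar multiplication $f(x)v = f(T)(v)$ of Example B.14 can be tried out on a small case. The sketch below assumes T is given by a square matrix (a list of rows), a polynomial is a list of coefficients with index equal to exponent, and the names `mat_vec`, `act`, and `poly_mul` are hypothetical helpers. It also spot-checks the module axiom $(fg)v = f(gv)$ on sample data.

```python
def mat_vec(T, v):
    """Apply the matrix T (list of rows) to the vector v."""
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]


def act(f, T, v):
    """f(x)v = f(T)(v), where f = [c0, c1, ...] stands for c0 + c1 x + ..."""
    result = [0] * len(v)
    power = list(v)  # holds T^k v, starting with k = 0
    for c in f:
        result = [r + c * p for r, p in zip(result, power)]
        power = mat_vec(T, power)
    return result


def poly_mul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


T = [[0, 1], [1, 0]]  # sample operator: swaps the two coordinates
v = [1, 2]
f, g = [1, 1], [0, 2]  # f = 1 + x, g = 2x
print(act([-1, 0, 1], T, v))                            # (x^2 - 1)v
print(act(poly_mul(f, g), T, v), act(f, T, act(g, T, v)))
```

Because $T^2 = I$ for this sample operator, the polynomial $x^2 - 1$ acts as zero on every vector.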


Let M be a module over R. An additive subgroup N of M is called a submodule of M if $u \in N$ and
$k \in R$ imply $ku \in N$. (Note that N is then a module over R.)

Let M and $M'$ be R-modules. A mapping $T: M \to M'$ is called a homomorphism (or R-homomorphism
or R-linear) if
\[ \text{(i) } T(u + v) = T(u) + T(v) \quad\text{and}\quad \text{(ii) } T(ku) = kT(u) \]
for every $u, v \in M$ and every $k \in R$.


PROBLEMS 


Groups 
B.1. Determine whether each of the following systems forms a group $G$:
(i) $G$ = set of integers, operation subtraction;
(ii) $G = \{1, -1\}$, operation multiplication;
(iii) $G$ = set of nonzero rational numbers, operation division;
(iv) $G$ = set of nonsingular $n \times n$ matrices, operation matrix multiplication;
(v) $G = \{a + bi : a, b \in \mathbf{Z}\}$, operation addition.


B.2. Show that in a group $G$:
(i) the identity element of $G$ is unique;
(ii) each $a \in G$ has a unique inverse $a^{-1} \in G$;
(iii) $(a^{-1})^{-1} = a$, and $(ab)^{-1} = b^{-1}a^{-1}$;
(iv) $ab = ac$ implies $b = c$, and $ba = ca$ implies $b = c$.


B.3. In a group $G$, the powers of $a \in G$ are defined by

$a^0 = e, \quad a^n = a\,a^{n-1}, \quad a^{-n} = (a^n)^{-1}, \quad \text{where } n \in \mathbf{N}$

Show that the following formulas hold for any integers $r, s, t \in \mathbf{Z}$: (i) $a^r a^s = a^{r+s}$, (ii) $(a^r)^s = a^{rs}$,
(iii) $(a^{r+s})^t = a^{rt+st}$.


B.4. Show that if $G$ is an abelian group, then $(ab)^n = a^n b^n$ for any $a, b \in G$ and any integer $n \in \mathbf{Z}$.
B.5. Suppose $G$ is a group such that $(ab)^2 = a^2 b^2$ for every $a, b \in G$. Show that $G$ is abelian.


B.6. Suppose $H$ is a subset of a group $G$. Show that $H$ is a subgroup of $G$ if and only if (i) $H$ is
nonempty, and (ii) $a, b \in H$ implies $ab^{-1} \in H$.


B.7. Prove that the intersection of any number of subgroups of G is also a subgroup of G. 


B.8. Show that the set of all powers of $a \in G$ is a subgroup of $G$; it is called the cyclic group generated
by $a$.


B.9. A group $G$ is said to be cyclic if $G$ is generated by some $a \in G$; that is, $G = \{a^n : n \in \mathbf{Z}\}$. Show
that every subgroup of a cyclic group is cyclic.


B.10. Suppose $G$ is a cyclic group. Show that $G$ is isomorphic to the set $\mathbf{Z}$ of integers under addition
or to the set $\mathbf{Z}_n$ (of the integers modulo $n$) under addition.


B.11. Let H be a subgroup of G. Show that the right (left) cosets of H partition G into mutually disjoint 
subsets. 


B.12. The order of a group $G$, denoted by $|G|$, is the number of elements of $G$. Prove Lagrange's
theorem: If $H$ is a subgroup of a finite group $G$, then $|H|$ divides $|G|$.


B.13. Suppose |G| = p where p is prime. Show that G is cyclic. 


B.14. Suppose $H$ and $N$ are subgroups of $G$ with $N$ normal. Show that (i) $HN$ is a subgroup of $G$ and
(ii) $H \cap N$ is a normal subgroup of $H$.


B.15. Let H be a subgroup of G with only two right (left) cosets. Show that H is a normal subgroup of G. 


B.16. Prove Theorem B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group 
G/H under coset multiplication. 


B.17. Suppose G is an abelian group. Show that any factor group G/H is also abelian. 


B.18. Let $f : G \to G'$ be a group homomorphism. Show that
(i) $f(e) = e'$ where $e$ and $e'$ are the identity elements of $G$ and $G'$, respectively;
(ii) $f(a^{-1}) = f(a)^{-1}$ for any $a \in G$.


B.19. Prove Theorem B.2: Let $f : G \to G'$ be a group homomorphism with kernel $K$. Then $K$ is a normal
subgroup of $G$, and the quotient group $G/K$ is isomorphic to the image of $f$.


B.20. Let $G$ be the multiplicative group of complex numbers $z$ such that $|z| = 1$, and let $\mathbf{R}$ be the additive
group of real numbers. Prove that $G$ is isomorphic to $\mathbf{R}/\mathbf{Z}$.


B.21. For a fixed $g \in G$, let $\hat{g} : G \to G$ be defined by $\hat{g}(a) = g^{-1}ag$. Show that $\hat{g}$ is an isomorphism of
$G$ onto $G$.


B.22. Let $G$ be the multiplicative group of $n \times n$ nonsingular matrices over $\mathbf{R}$. Show that the mapping
$A \mapsto |A|$ is a homomorphism of $G$ into the multiplicative group of nonzero real numbers.


B.23. Let $G$ be an abelian group. For a fixed $n \in \mathbf{Z}$, show that the map $a \mapsto a^n$ is a homomorphism of $G$
into $G$.


B.24. Suppose $H$ and $N$ are subgroups of $G$ with $N$ normal. Prove that $H \cap N$ is normal in $H$ and
$H/(H \cap N)$ is isomorphic to $HN/N$.


Rings

B.25. Show that in a ring $R$:
(i) $a \cdot 0 = 0 \cdot a = 0$, (ii) $a(-b) = (-a)b = -ab$, (iii) $(-a)(-b) = ab$.


B.26. Show that in a ring $R$ with a unit element: (i) $(-1)a = -a$, (ii) $(-1)(-1) = 1$.


B.27. Let $R$ be a ring. Suppose $a^2 = a$ for every $a \in R$. Prove that $R$ is a commutative ring. (Such a ring
is called a Boolean ring.)


B.28. Let $R$ be a ring with a unit element. We make $R$ into another ring $\hat{R}$ by defining $a \oplus b = a + b + 1$
and $a \cdot b = ab + a + b$. (i) Verify that $\hat{R}$ is a ring. (ii) Determine the 0-element and 1-element of $\hat{R}$.


B.29. Let $G$ be any (additive) abelian group. Define a multiplication in $G$ by $a \cdot b = 0$. Show that this
makes $G$ into a ring.


B.30. Prove Theorem B.3: Let $I$ be a (two-sided) ideal in a ring $R$. Then the cosets $\{a + I : a \in R\}$ form a
ring under coset addition and coset multiplication.


B.31. Let $I_1$ and $I_2$ be ideals in $R$. Prove that $I_1 + I_2$ and $I_1 \cap I_2$ are also ideals in $R$.


B.32. Let $R$ and $R'$ be rings. A mapping $f : R \to R'$ is called a homomorphism (or: ring homomorphism) if

(i) $f(a + b) = f(a) + f(b)$ and (ii) $f(ab) = f(a)f(b)$,

for every $a, b \in R$. Prove that if $f : R \to R'$ is a homomorphism, then the set $K = \{r \in R : f(r) = 0\}$ is an
ideal in $R$. (The set $K$ is called the kernel of $f$.)


Integral Domains and Fields 


B.33. Prove that in an integral domain $D$, if $ab = ac$, $a \neq 0$, then $b = c$.


B.34. Prove that $F = \{a + b\sqrt{2} : a, b \text{ rational}\}$ is a field.


B.35. Prove that $D = \{a + b\sqrt{2} : a, b \text{ integers}\}$ is an integral domain but not a field.


B.36. Prove that a finite integral domain $D$ is a field.


B.37. Show that the only ideals in a field $K$ are $\{0\}$ and $K$.


B.38. A complex number $a + bi$ where $a, b$ are integers is called a Gaussian integer. Show that the set $G$
of Gaussian integers is an integral domain. Also show that the units in $G$ are $\pm 1$ and $\pm i$.


B.39. Let $D$ be an integral domain and let $I$ be an ideal in $D$. Prove that the factor ring $D/I$ is an integral
domain if and only if $I$ is a prime ideal. (An ideal $I$ is prime if $ab \in I$ implies $a \in I$ or $b \in I$.)


B.40. Consider the integral domain $D = \{a + b\sqrt{13} : a, b \text{ integers}\}$ (see Example B.11). If
$\alpha = a + b\sqrt{13}$, we define $N(\alpha) = a^2 - 13b^2$. Prove: (i) $N(\alpha\beta) = N(\alpha)N(\beta)$; (ii) $\alpha$ is a unit if
and only if $N(\alpha) = \pm 1$; (iii) the units of $D$ are $\pm 1$, $18 \pm 5\sqrt{13}$ and $-18 \pm 5\sqrt{13}$; (iv) the
numbers $2$, $3 - \sqrt{13}$ and $-3 - \sqrt{13}$ are irreducible.


Modules 


B.41. Let $M$ be an $R$-module and let $A$ and $B$ be submodules of $M$. Show that $A + B$ and $A \cap B$ are also
submodules of $M$.


B.42. Let $M$ be an $R$-module with submodule $N$. Show that the cosets $\{u + N : u \in M\}$ form an
$R$-module under coset addition and scalar multiplication defined by $r(u + N) = ru + N$. (This
module is denoted by $M/N$ and is called the quotient module.)


B.43. Let $M$ and $M'$ be $R$-modules and let $f : M \to M'$ be an $R$-homomorphism. Show that the set
$K = \{u \in M : f(u) = 0\}$ is a submodule of $M$. (The set $K$ is called the kernel of $f$.)


B.44. Let $M$ be an $R$-module and let $E(M)$ denote the set of all $R$-homomorphisms of $M$ into itself. Define
the appropriate operations of addition and multiplication in $E(M)$ so that $E(M)$ becomes a ring.


APPENDIX C 


Polynomials over a Field 


C.1 Introduction 


We will investigate polynomials over a field K and show that they have many properties that are 
analogous to properties of the integers. These results play an important role in obtaining canonical forms 
for a linear operator T on a vector space V over K. 


C.2 Ring of Polynomials 


Let $K$ be a field. Formally, a polynomial $f$ over $K$ is an infinite sequence of elements from $K$ in which
all except a finite number of them are 0:


$f = (\ldots, 0, a_n, \ldots, a_1, a_0)$


(We write the sequence so that it extends to the left instead of to the right.) The entry $a_k$ is called the $k$th
coefficient of $f$. If $n$ is the largest integer for which $a_n \neq 0$, then we say that the degree of $f$ is $n$, written


$\deg f = n$


We also call $a_n$ the leading coefficient of $f$, and if $a_n = 1$ we call $f$ a monic polynomial. On the other hand,
if every coefficient of $f$ is 0 then $f$ is called the zero polynomial, written $f = 0$. The degree of the zero
polynomial is not defined.

Now if $g$ is another polynomial over $K$, say


$g = (\ldots, 0, b_m, \ldots, b_1, b_0)$

then the sum $f + g$ is the polynomial obtained by adding corresponding coefficients. That is, if $m \le n$, then

$f + g = (\ldots, 0, a_n, \ldots, a_m + b_m, \ldots, a_1 + b_1, a_0 + b_0)$

Furthermore, the product $fg$ is the polynomial

$fg = (\ldots, 0, a_n b_m, \ldots, a_1 b_0 + a_0 b_1, a_0 b_0)$

that is, the $k$th coefficient $c_k$ of $fg$ is


$c_k = \sum_{i=0}^{k} a_i b_{k-i} = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0$


The following theorem applies. 


THEOREM C.1: The set $P$ of polynomials over a field $K$ under the above operations of addition and
multiplication forms a commutative ring with a unit element and with no zero
divisors, that is, an integral domain. If $f$ and $g$ are nonzero polynomials in $P$, then


$\deg(fg) = \deg f + \deg g$.
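The coefficient formula $c_k = \sum_{i} a_i b_{k-i}$ translates directly into code. The following is a small sketch, assuming a polynomial is stored as a Python list with the constant term first (so index $k$ holds $a_k$, the reverse of the left-extending sequence above); `poly_mul` and `degree` are illustrative names. It also shows that the degree of a product of nonzero polynomials is the sum of the degrees.

```python
def poly_mul(f, g):
    """Return the product fg, where c_k = sum of a_i * b_(k-i)."""
    c = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b      # a_i * b_j contributes to c_(i+j)
    return c

def degree(f):
    """Largest k with a_k != 0, or None for the zero polynomial."""
    for k in range(len(f) - 1, -1, -1):
        if f[k] != 0:
            return k
    return None

f = [1, 3, 2]      # 2t^2 + 3t + 1
g = [-4, 1]        # t - 4
fg = poly_mul(f, g)
print(fg, degree(fg))   # coefficients of 2t^3 - 5t^2 - 11t - 4, degree 3
```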
Notation
We identify the scalar $a_0 \in K$ with the polynomial


$a_0 = (\ldots, 0, a_0)$

We also choose a symbol, say $t$, to denote the polynomial

$t = (\ldots, 0, 1, 0)$


We call the symbol $t$ an indeterminate. Multiplying $t$ with itself, we obtain

$t^2 = (\ldots, 0, 1, 0, 0), \quad t^3 = (\ldots, 0, 1, 0, 0, 0), \quad \ldots$

Thus, the above polynomial $f$ can be written uniquely in the usual form

$f = a_n t^n + \cdots + a_1 t + a_0$

When the symbol $t$ is selected as the indeterminate, the ring of polynomials over $K$ is denoted by

$K[t]$

and a polynomial $f$ is frequently denoted by $f(t)$.


We also view the field $K$ as a subset of $K[t]$ under the above identification. This is possible because the
operations of addition and multiplication of elements of $K$ are preserved under this identification:


$(\ldots, 0, a_0) + (\ldots, 0, b_0) = (\ldots, 0, a_0 + b_0)$
$(\ldots, 0, a_0) \cdot (\ldots, 0, b_0) = (\ldots, 0, a_0 b_0)$

We remark that the nonzero elements of $K$ are the units of the ring $K[t]$.
We also remark that every nonzero polynomial is an associate of a unique monic polynomial. Hence, if
$d$ and $d'$ are monic polynomials for which $d$ divides $d'$ and $d'$ divides $d$, then $d = d'$. (A polynomial $g$
divides a polynomial $f$ if there is a polynomial $h$ such that $f = hg$.)


C.3 Divisibility


The following theorem formalizes the process known as "long division."


THEOREM C.2 (Division Algorithm): Let $f$ and $g$ be polynomials over a field $K$ with $g \neq 0$. Then
there exist polynomials $q$ and $r$ such that


$f = qg + r$

where either $r = 0$ or $\deg r < \deg g$.

Proof: If $f = 0$ or if $\deg f < \deg g$, then we have the required representation

$f = 0g + f$

Now suppose $\deg f \ge \deg g$, say

$f = a_n t^n + \cdots + a_1 t + a_0$ and $g = b_m t^m + \cdots + b_1 t + b_0$


where $a_n, b_m \neq 0$ and $n \ge m$. We form the polynomial


$f_1 = f - \dfrac{a_n}{b_m} t^{n-m} g$    (1)

Then $\deg f_1 < \deg f$. By induction, there exist polynomials $q_1$ and $r$ such that

$f_1 = q_1 g + r$

where either $r = 0$ or $\deg r < \deg g$. Substituting this into (1) and solving for $f$,


$f = \left(q_1 + \dfrac{a_n}{b_m} t^{n-m}\right) g + r$


which is the desired representation.
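The induction in the proof is exactly the loop of polynomial long division: repeatedly subtract $(a_n/b_m)\,t^{n-m} g$ until the remainder has degree less than $\deg g$. A minimal sketch follows, assuming coefficients are rational numbers (Python's `fractions.Fraction`) and polynomials are lists with the constant term first; the names `trim` and `poly_divmod` are illustrative.

```python
from fractions import Fraction

def trim(f):
    """Drop trailing zero coefficients so the last entry is the leading one."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and either r = [] (zero) or deg r < deg g."""
    r = [Fraction(c) for c in trim(f)]
    g = [Fraction(c) for c in trim(g)]
    if not g:
        raise ZeroDivisionError("g must be a nonzero polynomial")
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)        # n - m
        coef = r[-1] / g[-1]           # a_n / b_m
        q[shift] = coef
        for i, b in enumerate(g):      # r <- r - coef * t^shift * g
            r[i + shift] -= coef * b
        r = trim(r)                    # leading term cancels, degree drops
    return trim(q), r

# (t^3 - 2t^2 + 4) divided by (t - 3)
q, r = poly_divmod([4, 0, -2, 1], [-3, 1])
print(q, r)   # q = t^2 + t + 3, r = 13
```

Working over `Fraction` keeps the arithmetic exact, which matches the hypothesis that the coefficients lie in a field.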


THEOREM C.3: The ring $K[t]$ of polynomials over a field $K$ is a principal ideal ring. If $I$ is an ideal in
$K[t]$, then there exists a unique monic polynomial $d$ that generates $I$, such that $d$
divides every polynomial $f \in I$.


Proof. Let $d$ be a polynomial of lowest degree in $I$. Because we can multiply $d$ by a nonzero scalar and
still remain in $I$, we can assume without loss of generality that $d$ is a monic polynomial. Now suppose
$f \in I$. By Theorem C.2 there exist polynomials $q$ and $r$ such that


$f = qd + r$ where either $r = 0$ or $\deg r < \deg d$


Now $f, d \in I$ implies $qd \in I$, and hence, $r = f - qd \in I$. But $d$ is a polynomial of lowest degree in $I$.
Accordingly, $r = 0$ and $f = qd$; that is, $d$ divides $f$. It remains to show that $d$ is unique. If $d'$ is another
monic polynomial that generates $I$, then $d$ divides $d'$ and $d'$ divides $d$. This implies that $d = d'$, because $d$
and $d'$ are monic. Thus, the theorem is proved.


THEOREM C.4: Let $f$ and $g$ be nonzero polynomials in $K[t]$. Then there exists a unique monic
polynomial $d$ such that
(i) $d$ divides $f$ and $g$; and (ii) if $d'$ divides $f$ and $g$, then $d'$ divides $d$.


DEFINITION: The above polynomial $d$ is called the greatest common divisor of $f$ and $g$. If $d = 1$,
then $f$ and $g$ are said to be relatively prime.


Proof of Theorem C.4. The set $I = \{mf + ng : m, n \in K[t]\}$ is an ideal. Let $d$ be the monic polynomial
that generates $I$. Note $f, g \in I$; hence, $d$ divides $f$ and $g$. Now suppose $d'$ divides $f$ and $g$. Let $J$ be the ideal
generated by $d'$. Then $f, g \in J$, and hence, $I \subseteq J$. Accordingly, $d \in J$ and so $d'$ divides $d$ as claimed. It
remains to show that $d$ is unique. If $d_1$ is another (monic) greatest common divisor of $f$ and $g$, then $d$
divides $d_1$ and $d_1$ divides $d$. This implies that $d = d_1$ because $d$ and $d_1$ are monic. Thus, the theorem is
proved.


COROLLARY C.5: Let $d$ be the greatest common divisor of the polynomials $f$ and $g$. Then there exist
polynomials $m$ and $n$ such that $d = mf + ng$. In particular, if $f$ and $g$ are relatively
prime, then there exist polynomials $m$ and $n$ such that $mf + ng = 1$.


The corollary follows directly from the fact that $d$ generates the ideal

$I = \{mf + ng : m, n \in K[t]\}$
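The proof of Corollary C.5 is an existence argument. One standard constructive route, which the text does not use, is the extended Euclidean algorithm built on repeated division. The sketch below assumes rational coefficients (`fractions.Fraction`), polynomials as coefficient lists with the constant term first, and at least one nonzero input; all function names are illustrative.

```python
from fractions import Fraction

def trim(f):
    f = [Fraction(c) for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f

def add(f, g):
    n = max(len(f), len(g))
    f = list(f) + [Fraction(0)] * (n - len(f))
    g = list(g) + [Fraction(0)] * (n - len(g))
    return trim([a + b for a, b in zip(f, g)])

def scale(f, c):
    return trim([c * a for a in f])

def mul(f, g):
    if not f or not g:
        return []
    h = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] += a * b
    return trim(h)

def divmod_poly(f, g):
    """Long division; g must be nonzero and trimmed."""
    q, r = [], trim(f)
    while len(r) >= len(g):
        term = [Fraction(0)] * (len(r) - len(g)) + [r[-1] / g[-1]]
        q = add(q, term)
        r = add(r, scale(mul(term, g), -1))
    return q, r

def gcd_bezout(f, g):
    """Return (d, m, n) with d the monic gcd of f and g and d = m*f + n*g."""
    r0, r1 = trim(f), trim(g)
    m0, m1 = [Fraction(1)], []     # invariant: r0 = m0*f + n0*g
    n0, n1 = [], [Fraction(1)]     # invariant: r1 = m1*f + n1*g
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        m0, m1 = m1, add(m0, scale(mul(q, m1), -1))
        n0, n1 = n1, add(n0, scale(mul(q, n1), -1))
    lead = r0[-1]                  # normalize so that d is monic
    return scale(r0, 1 / lead), scale(m0, 1 / lead), scale(n0, 1 / lead)

f = [-1, 0, 1]     # t^2 - 1
g = [2, -3, 1]     # t^2 - 3t + 2 = (t - 1)(t - 2)
d, m, n = gcd_bezout(f, g)
print(d, m, n)     # d = t - 1
```

For this pair the greatest common divisor is $t - 1$, and the returned $m$ and $n$ satisfy $mf + ng = d$.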


C.4 Factorization 


A polynomial $p \in K[t]$ of positive degree is said to be irreducible if $p = fg$ implies $f$ or $g$ is a scalar.


LEMMA C.6: Suppose $p \in K[t]$ is irreducible. If $p$ divides the product $fg$ of polynomials $f, g \in K[t]$,
then $p$ divides $f$ or $p$ divides $g$. More generally, if $p$ divides the product of $n$
polynomials $f_1 f_2 \cdots f_n$, then $p$ divides one of them.


Proof. Suppose $p$ divides $fg$ but not $f$. Because $p$ is irreducible, the polynomials $f$ and $p$ must then be
relatively prime. Thus, there exist polynomials $m, n \in K[t]$ such that $mf + np = 1$. Multiplying this
equation by $g$, we obtain $mfg + npg = g$. But $p$ divides $fg$ and so $mfg$, and $p$ divides $npg$; hence, $p$ divides
the sum $g = mfg + npg$.

Now suppose $p$ divides $f_1 f_2 \cdots f_n$. If $p$ divides $f_1$, then we are through. If not, then by the above result $p$
divides the product $f_2 \cdots f_n$. By induction on $n$, $p$ divides one of the polynomials $f_2, \ldots, f_n$. Thus, the
lemma is proved.


THEOREM C.7: (Unique Factorization Theorem) Let $f$ be a nonzero polynomial in $K[t]$. Then $f$ can
be written uniquely (except for order) as a product


$f = k p_1 p_2 \cdots p_n$


where $k \in K$ and the $p_i$ are monic irreducible polynomials in $K[t]$.


Proof. We prove the existence of such a product first. If $f$ is irreducible or if $f \in K$, then such a product
clearly exists. On the other hand, suppose $f = gh$ where $g$ and $h$ are nonscalars. Then $g$ and $h$ have degrees
less than that of $f$. By induction, we can assume


$g = k_1 g_1 g_2 \cdots g_r$ and $h = k_2 h_1 h_2 \cdots h_s$

where $k_1, k_2 \in K$ and the $g_i$ and $h_j$ are monic irreducible polynomials. Accordingly,

$f = (k_1 k_2)\, g_1 g_2 \cdots g_r h_1 h_2 \cdots h_s$


is our desired representation.
We next prove uniqueness (except for order) of such a product for $f$. Suppose


$f = k p_1 p_2 \cdots p_n = k' q_1 q_2 \cdots q_m$


where $k, k' \in K$ and the $p_1, \ldots, p_n, q_1, \ldots, q_m$ are monic irreducible polynomials. Now $p_1$ divides
$k' q_1 \cdots q_m$. Because $p_1$ is irreducible, it must divide one of the $q_i$ by the above lemma. Say $p_1$ divides $q_1$.
Because $p_1$ and $q_1$ are both irreducible and monic, $p_1 = q_1$. Accordingly,


$k p_2 \cdots p_n = k' q_2 \cdots q_m$


By induction, we have that $n = m$ and $p_2 = q_2, \ldots, p_n = q_m$ for some rearrangement of the $q_i$. We also
have that $k = k'$. Thus, the theorem is proved.

If the field K is the complex field C, then we have the following result that is known as the 
fundamental theorem of algebra; its proof lies beyond the scope of this text. 


THEOREM C.8: (Fundamental Theorem of Algebra) Let $f(t)$ be a nonzero polynomial over the
complex field $\mathbf{C}$. Then $f(t)$ can be written uniquely (except for order) as a product


$f(t) = k(t - r_1)(t - r_2) \cdots (t - r_n)$


where $k, r_i \in \mathbf{C}$; that is, as a product of linear polynomials.
In the case of the real field R we have the following result. 


THEOREM C.9: Let $f(t)$ be a nonzero polynomial over the real field $\mathbf{R}$. Then $f(t)$ can be written
uniquely (except for order) as a product


$f(t) = k p_1(t) p_2(t) \cdots p_m(t)$


where $k \in \mathbf{R}$ and the $p_i(t)$ are monic irreducible polynomials of degree one or two.


APPENDIX D 


Odds and Ends 


D.1 Introduction 


This appendix discusses various topics, such as equivalence relations, determinants and block matrices, 
and the generalized MP (Moore-Penrose) inverse. 


D.2 Relations and Equivalence Relations 


A binary relation or simply relation $R$ from a set $A$ to a set $B$ assigns to each ordered pair $(a, b) \in A \times B$
exactly one of the following statements:

(i) "$a$ is related to $b$," written $a \mathrel{R} b$, (ii) "$a$ is not related to $b$," written $a \mathrel{\not R} b$.

A relation from a set $A$ to the same set $A$ is called a relation on $A$.

Observe that any relation $R$ from $A$ to $B$ uniquely defines a subset $\hat{R}$ of $A \times B$ as follows:


$\hat{R} = \{(a, b) : a \mathrel{R} b\}$

Conversely, any subset $\hat{R}$ of $A \times B$ defines a relation from $A$ to $B$ as follows:

$a \mathrel{R} b$ if and only if $(a, b) \in \hat{R}$

In view of the above correspondence between relations from $A$ to $B$ and subsets of $A \times B$, we redefine a
relation from $A$ to $B$ as follows:


DEFINITION D.1: A relation $R$ from $A$ to $B$ is a subset of $A \times B$.


Equivalence Relations 


Consider a nonempty set $S$. A relation $R$ on $S$ is called an equivalence relation if $R$ is reflexive,
symmetric, and transitive; that is, if $R$ satisfies the following three axioms:


[E1] (Reflexivity) Every $a \in S$ is related to itself. That is, for every $a \in S$, $a \mathrel{R} a$.

[E2] (Symmetry) If $a$ is related to $b$, then $b$ is related to $a$. That is, if $a \mathrel{R} b$, then $b \mathrel{R} a$.

[E3] (Transitivity) If $a$ is related to $b$ and $b$ is related to $c$, then $a$ is related to $c$. That is,
if $a \mathrel{R} b$ and $b \mathrel{R} c$, then $a \mathrel{R} c$.


The general idea behind an equivalence relation is that it is a classification of objects that are in some way
"alike." Clearly, the relation of equality is an equivalence relation. For this reason, one frequently uses $\sim$
or $\equiv$ to denote an equivalence relation.


EXAMPLE D.1 

(a) In Euclidean geometry, similarity of triangles is an equivalence relation. Specifically, suppose $\alpha, \beta, \gamma$ are
triangles. Then (i) $\alpha$ is similar to itself. (ii) If $\alpha$ is similar to $\beta$, then $\beta$ is similar to $\alpha$. (iii) If $\alpha$ is similar to $\beta$ and $\beta$
is similar to $\gamma$, then $\alpha$ is similar to $\gamma$.

(b) The relation $\subseteq$ of set inclusion is not an equivalence relation. It is reflexive and transitive, but it is not symmetric
because $A \subseteq B$ does not imply $B \subseteq A$.


Equivalence Relations and Partitions 


Let $S$ be a nonempty set. Recall first that a partition $P$ of $S$ is a subdivision of $S$ into nonempty,
nonoverlapping subsets; that is, a collection $P = \{A_i\}$ of nonempty subsets of $S$ such that (i) each $a \in S$
belongs to one of the $A_i$, (ii) the sets $\{A_i\}$ are mutually disjoint.

The subsets in a partition $P$ are called cells. Thus, each $a \in S$ belongs to exactly one of the cells. Also,
any element $b \in A_i$ is called a representative of the cell $A_i$, and a subset $B$ of $S$ is called a system of
representatives if $B$ contains exactly one element in each of the cells in $\{A_i\}$.

Now suppose $R$ is an equivalence relation on the nonempty set $S$. For each $a \in S$, the equivalence class
of $a$, denoted by $[a]$, is the set of elements of $S$ to which $a$ is related:


$[a] = \{x : a \mathrel{R} x\}$.

The collection of equivalence classes, denoted by $S/R$, is called the quotient of $S$ by $R$:

$S/R = \{[a] : a \in S\}$

The fundamental property of an equivalence relation and its quotient set is contained in the following
theorem:


THEOREM D.1: Let R be an equivalence relation on a nonempty set S. Then the quotient set S/R is a 
partition of S. 


EXAMPLE D.2 Let $\equiv$ be the relation on the set $\mathbf{Z}$ of integers defined by

$x \equiv y \pmod 5$

which reads "$x$ is congruent to $y$ modulo 5" and which means that the difference $x - y$ is divisible by 5.
Then $\equiv$ is an equivalence relation on $\mathbf{Z}$.
There are exactly five equivalence classes in the quotient set $\mathbf{Z}/\equiv$ as follows:


$A_0 = \{\ldots, -10, -5, 0, 5, 10, \ldots\}$
$A_1 = \{\ldots, -9, -4, 1, 6, 11, \ldots\}$
$A_2 = \{\ldots, -8, -3, 2, 7, 12, \ldots\}$
$A_3 = \{\ldots, -7, -2, 3, 8, 13, \ldots\}$
$A_4 = \{\ldots, -6, -1, 4, 9, 14, \ldots\}$


Note that any integer $x$, which can be expressed uniquely in the form $x = 5q + r$ where $0 \le r < 5$, is a
member of the equivalence class $A_r$ where $r$ is the remainder. As expected, the equivalence classes are
disjoint and their union is $\mathbf{Z}$:


$\mathbf{Z} = A_0 \cup A_1 \cup A_2 \cup A_3 \cup A_4$

This quotient set $\mathbf{Z}/\equiv$, called the integers modulo 5, is denoted
$\mathbf{Z}/5\mathbf{Z}$ or simply $\mathbf{Z}_5$.


Usually one chooses $\{0, 1, 2, 3, 4\}$ or $\{-2, -1, 0, 1, 2\}$ as a system of representatives of the equivalence
classes.
Analogously, for any positive integer $m$, there exists the congruence relation $\equiv$ defined by


$x \equiv y \pmod m$


and the quotient set $\mathbf{Z}/\equiv$ is called the integers modulo $m$.
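Theorem D.1 and Example D.2 can be illustrated on a finite window of integers. The following sketch assumes the relation is supplied as a Python predicate that really is an equivalence relation (the code does not check this); `equivalence_classes` is an illustrative name.

```python
def equivalence_classes(elements, related):
    """Group elements into classes [a] = {x : a R x}.

    Assumes `related` is reflexive, symmetric and transitive, so it is
    enough to compare each x with one representative of every class.
    """
    classes = []
    for x in elements:
        for cls in classes:
            if related(cls[0], x):
                cls.append(x)
                break
        else:
            classes.append([x])    # x starts a new class
    return classes

S = range(-10, 15)
classes = equivalence_classes(S, lambda x, y: (x - y) % 5 == 0)
for cls in classes:
    print(cls)
```

On this window the function returns five classes, the first being `[-10, -5, 0, 5, 10]`, and every integer of the window lies in exactly one class, as a partition requires.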
D.3 Determinants and Block Matrices


Recall first:


THEOREM 8.12: Suppose $M$ is an upper (lower) triangular block matrix with diagonal blocks
$A_1, A_2, \ldots, A_n$. Then $\det(M) = \det(A_1)\det(A_2)\cdots\det(A_n)$.


Accordingly, if $M = \begin{bmatrix} A & B \\ 0 & D \end{bmatrix}$ where $A$ is $r \times r$ and $D$ is $s \times s$, then $\det(M) = \det(A)\det(D)$.


THEOREM D.2: Consider the block matrix $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$ where $A$ is nonsingular, $A$ is $r \times r$ and $D$
is $s \times s$. Then $\det(M) = \det(A)\det(D - CA^{-1}B)$.


Proof: Follows from the fact that $M = \begin{bmatrix} I & 0 \\ CA^{-1} & I \end{bmatrix} \begin{bmatrix} A & B \\ 0 & D - CA^{-1}B \end{bmatrix}$ and the above result.
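The block formula $\det(M) = \det(A)\det(D - CA^{-1}B)$ can be spot-checked with exact arithmetic. The sketch below is only practical for small matrices: it computes determinants by cofactor expansion and the inverse through the classical adjoint, both written from scratch. The matrices $A$, $B$, $C$, $D$ are arbitrary illustrative choices, not taken from the text.

```python
from fractions import Fraction

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return 1                   # empty matrix, needed for 1x1 cofactors
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def inverse(M):
    """Inverse via the classical adjoint: M^{-1} = adj(M) / det(M)."""
    n, d = len(M), Fraction(det(M))
    def cofactor(i, j):
        minor = [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]
        return (-1) ** (i + j) * det(minor)
    return [[cofactor(j, i) / d for j in range(n)] for i in range(n)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matsub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

A = [[2, 1], [1, 1]]               # nonsingular r x r block (r = 2)
B = [[1, 0], [2, 1]]
C = [[0, 1], [1, 1]]
D = [[3, 1], [1, 2]]               # s x s block (s = 2)

# Assemble M = [[A, B], [C, D]] row by row.
M = [ra + rb for ra, rb in zip(A, B)] + [rc + rd for rc, rd in zip(C, D)]

schur = matsub(D, matmul(matmul(C, inverse(A)), B))   # D - C A^{-1} B
lhs = det(M)
rhs = det(A) * det(schur)
print(lhs, rhs)
```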


D.4 Full Rank Factorization 


A matrix B is said to have full row rank r if B has r rows that are linearly independent, and a matrix C is 
said to have full column rank r if C has r columns that are linearly independent. 


DEFINITION D.2: Let $A$ be an $m \times n$ matrix of rank $r$. Then $A$ is said to have the full rank factorization

$A = BC$

where $B$ has full column rank $r$ and $C$ has full row rank $r$.


THEOREM D.3: Every matrix A with rank r > 0 has a full rank factorization. 


There are many full rank factorizations of a matrix A. Fig. D-1 gives an algorithm to find one such 
factorization. 


Algorithm D-1: The input is a matrix A of rank r > 0. The output is a full rank factorization of A. 

Step 1. Find the row canonical form $M$ of $A$.

Step 2. Let B be the matrix whose columns are the columns of A corresponding to the columns of M 
with pivots. 

Step 3. Let C be the matrix whose rows are the nonzero rows of M. 

Then A = BC is a full rank factorization of A. 


Figure D-1 


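EXAMPLE D.3 Let $A = \begin{bmatrix} 1 & 1 & -1 & 2 \\ 2 & 2 & -1 & 3 \\ -1 & -1 & 2 & -3 \end{bmatrix}$ where $M = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$ is the row canonical form of $A$.
We set

$B = \begin{bmatrix} 1 & -1 \\ 2 & -1 \\ -1 & 2 \end{bmatrix}$ and $C = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{bmatrix}$


Then $A = BC$ is a full rank factorization of $A$.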
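The three steps of Algorithm D-1 can be sketched in a few lines. The code below assumes exact rational arithmetic and a matrix of rank $r > 0$ given as a list of rows; `rref` and `full_rank_factorization` are illustrative names, and the row reduction is a plain Gauss-Jordan pass rather than an optimized routine.

```python
from fractions import Fraction

def rref(A):
    """Return (M, pivots): the row canonical form of A and its pivot columns."""
    M = [[Fraction(x) for x in row] for row in A]
    pivots, r = [], 0
    for c in range(len(M[0])):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        M[r] = [x / M[r][c] for x in M[r]]          # make the pivot 1
        for i in range(len(M)):
            if i != r and M[i][c] != 0:             # clear the rest of column c
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots

def full_rank_factorization(A):
    M, pivots = rref(A)                               # Step 1
    B = [[row[c] for c in pivots] for row in A]       # Step 2: pivot columns of A
    C = [row for row in M if any(row)]                # Step 3: nonzero rows of M
    return B, C

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

A = [[1, 1, -1, 2], [2, 2, -1, 3], [-1, -1, 2, -3]]
B, C = full_rank_factorization(A)
print(B)
print(C)
```

Applied to the matrix of Example D.3, this reproduces the $B$ and $C$ given there, and the product `matmul(B, C)` recovers $A$.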
D.5 Generalized (Moore-Penrose) Inverse


Here we assume that the field of scalars is the complex field $\mathbf{C}$ where the matrix $A^H$ is the conjugate
transpose of a matrix $A$. [If $A$ is a real matrix, then $A^H = A^T$.]


DEFINITION D.3: Let $A$ be an $m \times n$ matrix over $\mathbf{C}$. A matrix, denoted by $A^+$, is called the
pseudoinverse or Moore-Penrose inverse or MP-inverse of $A$ if $X = A^+$ satisfies the
following four equations:

[MP1] $AXA = A$, [MP3] $(AX)^H = AX$,
[MP2] $XAX = X$, [MP4] $(XA)^H = XA$.

Clearly, $A^+$ is an $n \times m$ matrix. Also, $A^+ = A^{-1}$ if $A$ is nonsingular.

LEMMA D.4: $A^+$ is unique (when it exists).

Proof. Suppose $X$ and $Y$ satisfy the four MP equations. Then

$AY = (AY)^H = (AXAY)^H = (AY)^H (AX)^H = AYAX = (AYA)X = AX$


The first and fourth equations use [MP3], and the second and last equations use [MP1]. Similarly,
$YA = XA$ (which uses [MP4] and [MP1]). Then,


$Y = YAY = (YA)Y = (XA)Y = X(AY) = X(AX) = X$

where the first equation uses [MP2].


LEMMA D.5: $A^+$ exists for any matrix $A$.


Fig. D-2 gives an algorithm that finds an MP-inverse for any matrix A. 


Algorithm D-2. Input is an $m \times n$ matrix $A$ over $\mathbf{C}$ of rank $r$. Output is $A^+$.

Step 1. Interchange rows and columns of $A$ so that $PAQ = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$ where $A_{11}$ is a nonsingular
$r \times r$ block. [Here $P$ and $Q$ are the products of elementary matrices corresponding to the
interchanges of the rows and columns.]

Step 2. Set $B = \begin{bmatrix} A_{11} \\ A_{21} \end{bmatrix}$ and $C = [I_r, \; A_{11}^{-1} A_{12}]$ where $I_r$ is the $r \times r$ identity matrix.

Step 3. Set $A^+ = Q\,[C^H (CC^H)^{-1} (B^H B)^{-1} B^H]\,P$.


Figure D-2 
Combining the above two lemmas we obtain: 


THEOREM D.6: Every matrix $A$ over $\mathbf{C}$ has a unique Moore-Penrose inverse $A^+$.

There are special cases when $A$ has full row rank or full column rank.


THEOREM D.7: Let A be a matrix over C. 


(a) If $A$ has full column rank (columns are linearly independent), then
$A^+ = (A^H A)^{-1} A^H$.
(b) If $A$ has full row rank (rows are linearly independent), then $A^+ = A^H (A A^H)^{-1}$.


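THEOREM D.8: Let $A$ be a matrix over $\mathbf{C}$. Suppose $A = BC$ is a full rank factorization of $A$. Then

$A^+ = C^+ B^+ = C^H (CC^H)^{-1} (B^H B)^{-1} B^H$

Moreover, $AA^+ = BB^+$ and $A^+A = C^+C$.

EXAMPLE D.4 Consider the full rank factorization $A = BC$ in Example D.3; that is,

$A = \begin{bmatrix} 1 & 1 & -1 & 2 \\ 2 & 2 & -1 & 3 \\ -1 & -1 & 2 & -3 \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ 2 & -1 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 1 & 0 & 1 \\ 0 & 0 & 1 & -1 \end{bmatrix} = BC$

Then

$(CC^H)^{-1} = \dfrac{1}{5}\begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}, \quad C^H(CC^H)^{-1} = \dfrac{1}{5}\begin{bmatrix} 2 & 1 \\ 2 & 1 \\ 1 & 3 \\ 1 & -2 \end{bmatrix}, \quad (B^H B)^{-1} = \dfrac{1}{11}\begin{bmatrix} 6 & 5 \\ 5 & 6 \end{bmatrix}, \quad (B^H B)^{-1}B^H = \dfrac{1}{11}\begin{bmatrix} 1 & 7 & 4 \\ -1 & 4 & 7 \end{bmatrix}$


Accordingly, the following is the Moore-Penrose inverse of $A$:


$A^+ = \dfrac{1}{55}\begin{bmatrix} 1 & 18 & 15 \\ 1 & 18 & 15 \\ -2 & 19 & 25 \\ 3 & -1 & -10 \end{bmatrix}$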
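The matrix obtained in Example D.4 can be tested against the four defining equations [MP1] to [MP4]. The following sketch uses exact fractions and relies on $A$ being real, so that the conjugate transpose is the ordinary transpose; the helper names are illustrative.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

A = [[1, 1, -1, 2], [2, 2, -1, 3], [-1, -1, 2, -3]]

# Candidate X = A^+ from Example D.4, entries divided by 55.
X = [[Fraction(x, 55) for x in row]
     for row in [[1, 18, 15], [1, 18, 15], [-2, 19, 25], [3, -1, -10]]]

AX = matmul(A, X)
XA = matmul(X, A)

mp1 = matmul(AX, A) == A          # AXA = A
mp2 = matmul(XA, X) == X          # XAX = X
mp3 = transpose(AX) == AX         # (AX)^H = AX for a real matrix
mp4 = transpose(XA) == XA         # (XA)^H = XA for a real matrix
print(mp1, mp2, mp3, mp4)
```

All four checks hold for this $X$, which by Lemma D.4 identifies it as the Moore-Penrose inverse of $A$.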


D.6 Least-Square Solution 


Consider a system $AX = B$ of linear equations. A least-square solution of $AX = B$ is the vector of
smallest Euclidean norm that minimizes $\|AX - B\|_2$. That vector is


$X = A^+ B$

[In case $A$ is invertible, so $A^+ = A^{-1}$, then $X = A^{-1}B$, which is the unique solution of the system.]
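EXAMPLE D.5 Consider the following system $AX = B$ of linear equations:

$x + y - z + 2t = 1$
$2x + 2y - z + 3t = 3$
$-x - y + 2z - 3t = 2$


Then, using Example D.4,


$A = \begin{bmatrix} 1 & 1 & -1 & 2 \\ 2 & 2 & -1 & 3 \\ -1 & -1 & 2 & -3 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}, \quad A^+ = \dfrac{1}{55}\begin{bmatrix} 1 & 18 & 15 \\ 1 & 18 & 15 \\ -2 & 19 & 25 \\ 3 & -1 & -10 \end{bmatrix}$

Accordingly,

$X = A^+ B = (1/55)[85, 85, 105, -20]^T = [17/11, 17/11, 21/11, -4/11]^T$


is the vector of smallest Euclidean norm which minimizes $\|AX - B\|_2$.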
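The computation in Example D.5 can be reproduced exactly, together with a check that the result minimizes the residual. A vector $X$ minimizes $\|AX - B\|_2$ precisely when it satisfies the normal equations $A^T(AX - B) = 0$; this criterion is a standard fact not stated in the text. The sketch below uses exact fractions and illustrative helper names.

```python
from fractions import Fraction

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[1, 1, -1, 2], [2, 2, -1, 3], [-1, -1, 2, -3]]
B = [1, 3, 2]

# Moore-Penrose inverse from Example D.4, entries divided by 55.
A_plus = [[Fraction(x, 55) for x in row]
          for row in [[1, 18, 15], [1, 18, 15], [-2, 19, 25], [3, -1, -10]]]

X = matvec(A_plus, B)                                   # X = A^+ B
residual = [ax - b for ax, b in zip(matvec(A, X), B)]   # AX - B
normal = matvec(transpose(A), residual)                 # A^T (AX - B)

print(X)
print(residual)
print(normal)
```

The residual is nonzero, so the system has no exact solution, while the normal equations vanish, confirming that $X$ is a least-square solution.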


$A = [a_{ij}]$, matrix, 27
$\bar{A} = [\bar{a}_{ij}]$, conjugate matrix, 38
|A|, determinant, 264, 268 

A*, adjoint, 377 

A", conjugate transpose, 38 

A’, transpose, 33 

At, Moore-Penrose inverse, 418 
Aj, minor, 269 

A(I,J), minor, 273 

A(V), linear operators, 174 

adj A, adjoint (classical), 271 

A ~ B, row equivalence, 72 

A ~ B, congruence, 360 

C, complex numbers, 11 

C”, complex n-space, 13 

C{a, b], continuous functions, 228 

C( f), companion matrix, 304 

colsp (A), column space, 120 

d(u, v), distance, 5, 241 
diag(a,,,..-,@,n), diagonal matrix, 35 
diag(A,,,...,A,,), block diagonal, 40 
det(A), determinant, 268 

dim V, dimension, 124 

{e,,...,e,}, usual basis, 125 

E,, projections, 384 

f :A— B, mapping, 164 

F(X), function space, 114 

Go F, composition, 173 

Hom(V, U), homomorphisms, 174 

i, j, k, 9 

[,, identity matrix, 33 

ImF, image, 169 

J(A), Jordan block, 329 

K, field of scalars, 112 

Ker F, kernel, 169 

m(t), minimal polynomial, 303 
$M_{m,n}$, $m \times n$ matrices, 114


LIST OF SYMBOLS 


n-space, 5, 13, 227, 240 

P(t), polynomials, 114 

P(t), polynomials, 114 

proj(u, v), projection, 6, 234 
proj(u, V), projection, 235 

Q, rational numbers, 11 

R, real numbers, 1 

R”, real n-space, 2 

rowsp (A), row-space, 120 

S+, orthogonal complement, 231 
sgno, sign, parity, 267 
span(S), linear span, 119 

tr(A), trace, 33 

[T];, matrix representation, 195 
T*, adjoint, 377 

T-invariant, 327 

T', transpose, 351 

$\|u\|$, norm, 5, 13, 227, 241
[u],, coordinate vector, 130 

u. v, dot product, 4, 13 

(u, v), inner product, 226, 238 
u X v, cross product, 10 

u ® v, tensor product, 396 

u ^ v, exterior product, 401 
u® v, direct sum, 129, 327 

V = U, isomorphism, 132, 169 
V ® W, tensor product, 396 
V*, dual space, 349 

V**, second dual space, 350 
N V, exterior product, 401 
W°, annihilator, 351 

Z, complex conjugate, 12 
Z(v,T), T-cyclic subspace, 330 
$\delta_{ij}$, Kronecker delta, 37

A(t), characteristic polynomial, 294 
A, eigenvalue, 296 

$\sum$, summation symbol, 29


A 


Absolute value (complex), 12 
Abelian group, 403 
Adjoint, classical, 271 

operator, 377, 384 
Algebraic multiplicity, 298 
Alternating mappings, 276, 360, 399 
Angle between vectors, 6, 230 
Annihilator, 330, 351, 354 
Associate, 406 
Associated homogeneous system, 83 
Associative, 174, 403 
Augmented matrix, 59 


B 
Back-substitution, 63, 65, 67 
Basis, 82, 124, 139 
change of, 199, 211 
dual, 350, 352 
orthogonal, 243 
orthonormal, 243 
second dual, 367 
standard, 125 
usual, 125 
Basis-finding algorithm, 127 
Bessel inequality, 264 
Bijective mapping, 166
Bilinear form, 359, 396 
alternating, 276 
matrix representation of, 360 
polar form of, 363 
real symmetric, 363 
symmetric, 361 
Bilinear mapping, 359, 396 
Block matrix, 39, 50 
determinants, 417 
Jordan, 344 
square, 40 
Bounded, 156 


C 
Cancellation law, 113 
Canonical forms, 205, 325 
Jordan, 329, 336 
rational, 331 
row, 74 
triangular, 325 
Casting-out algorithm, 128 
Cauchy—Schwarz inequality, 5, 229, 240 


Cayley—Hamilton theorem, 294, 308 
Cells, 39, 415 
Change of basis, 199, 211 
Change-of-basis (transition) matrix, 199 
Change-of-coordinate matrix, 221 
Characteristic polynomial, 294, 305 

value, 296 
Classical adjoint, 271 
Coefficient, 57, 58, 411 

Fourier, 233, 244 

matrix, 59 
Cofactor, 269 
Column, 27 

matrix, 27 

operations, 89 

space, 120 

vector, 3 
Colsp(A), column space, 126 
Commutative law, 403 

group, 113 
Commuting (diagram), 396 
Companion matrix, 304 
Complement, orthogonal, 242 
Complementary minor, 273 
Completing the square, 393 
Complex: 

conjugate, 13 

inner product, 239 

matrix, 38, 49 

n-space, 13 

numbers, 1, 11, 13 

plane, 12 
Complexity, 88 
Components, 2 
Composition of mappings, 165 
Congruent matrices, 360 

diagonalization, 61 
Conjugate: 

complex, 12 

linearity, 239 

matrix, 38 

symmetric, 239 
Consistent system, 59 
Constant term, 57, 58 
Convex set, 193 
Coordinates, 2, 130 

vector, 130 
Coset, 182, 332, 403 
Cramer’s rule, 272 




Cross product, 10 

Curves, 8 

Cyclic subspaces, 330, 342 
group, 408 


D 
$\delta_{ij}$, Kronecker delta function, 33
Decomposable, 327 
Decomposition: 
direct-sum, 129 
primary, 238 
Degenerate, 360 
bilinear form, 360 
linear equations, 59 
Dependence, linear, 133 
Derivative, 168 
Determinant, 63, 264, 267 
computation of, 66, 270 
linear operator, 275 
order, 3, 266 
Diagonal, 32 
blocks, 40 
matrix, 35, 47 
quadratic form, 302 
Diagonal (of a matrix), 10 
Diagonalizable, 203, 292, 296 
Diagonalization: 
algorithm, 299 
in inner product space, 382 
Dimension of solution spaces, 82 
Dimension of vector spaces, 82, 139 
finite, 124 
infinite, 124 
subspaces, 126 
Direct sum, 129, 327 
decomposition, 327 
Directed line segment, 7 
Distance, 5, 241 
Divides, 412 
Division algorithm, 412 
Domain, 164, 406 
Dot product, 4 
Dual: 
basis, 350, 352 
space, 349, 352 


E 
Echelon: 

form, 65, 72 

matrices, 70 
Eigenline, 296 
Eigenspace, 299 
Eigenvalue, 296, 298, 312 
Eigenvector, 296, 298, 312 
Elementary divisors, 331 
Elementary matrix, 84 
Elementary operations, 61 

column, 86 

row, 72, 120 


Index 


Elimination, Gaussian, 67 
Empty set, Ø, 112 
Equal: 

functions, 164 

matrices, 27 

vectors, 2 
Equations (See Linear equations) 
Equivalence: 

classes, 416 

matrix, 87 

relation, 73, 415 

row, 72 
Equivalent systems, 61 
Euclidean n-space, 5, 228 
Exterior product, 401 


F 
Field of scalars, 11, 406 
Finite dimension, 124 
Form: 
bilinear, 359 
linear, 349 
quadratic, 363 
Forward elimination, 63, 67, 73
Fourier coefficient, 81, 233 
series, 233 
Free variable, 65, 66 
Full rank, 41 
factorization, 417 
Function, 154 
space F(X), 114 
Functional, linear, 349 
Fundamental Theorem of Algebra, 414 


G 

Gaussian elimination, 61, 67, 73 
Gaussian integer, 409 

Gauss—Jordan algorithm, 74 

General solution, 58 

Geometric multiplicity, 298 
Gram-Schmidt orthogonalization, 235 
Graph, 164 

Greatest common divisor, 413 

Group, 113, 403 


H 
Hermitian: 

form, 364 

matrix, 38, 49 

quadratic form, 364 
Hilbert space, 229 
Homogeneous system, 58, 81 
Homomorphism, 173, 404, 407 
Hom(V, U), 173 
Hyperplane, 7, 358 


I 
i, imaginary, 12 
Ideal, 405 




Identity: 
mapping, 166, 168 
matrix, 33 
ijk notation, 9 
Image, 164, 169, 170 
Imaginary part, 12 
Im F, image, 169 
Im z, imaginary part, 12 
Inclusion mapping, 190 
Inconsistent systems, 59 
Independence, linear, 133 
Index, 30 
Index of nilpotency, 328 
Inertia, Law of, 364 
Infinite dimension, 124 
Infinity-norm, 241 
Injective mapping, 166 
Inner product, 4 
complex, 239 
Inner product spaces, 226 
linear operators on, 377 
Integral, 168 
domain, 406 
Invariance, 224 
Invariant subspaces, 224, 326, 332 
direct-sum, 327 
Inverse image, 164 
Inverse mapping, 164 
Inverse matrix, 34, 46, 85 
computing, 85 
inversion, 267 
Invertible: 
matrices, 34, 46 
Irreducible, 406 
Isometry, 381 
Isomorphic vector spaces, 169, 404 


J 
Jordan: 
block, 304 
canonical form, 329, 336 


K 
Ker F, kernel, 169 
Kernel, 169, 170 


Kronecker delta function δ_ij, 33


L 
l_2-space, 229
Laplace expansion, 270 
Law of inertia, 363 
Leading: 
coefficient, 60 
nonzero element, 70 
unknown, 60 
Least square solution, 419 
Legendre polynomial, 237 
Length, 5, 227 
Limits (summation), 30 
Line, 8, 192 


Linear:
combination, 3, 29, 60, 79, 115
dependence, 121
form, 349
functional, 349
independence, 121
span, 119
Linear equation, 57
Linear equations (system), 58
consistent, 59
echelon form, 65
triangular form, 64
Linear mapping (function), 164, 167
image, 164, 169
kernel, 169
nullity, 171
rank, 171
Linear operator:
adjoint, 377
characteristic polynomial, 304
determinant, 275
on inner product spaces, 377
invertible, 175
matrix representation, 195
Linear transformation (See linear mappings), 167
Located vectors, 7
LU decomposition, 87, 104


M
M_{m,n}, matrix vector space, 114
Mappings (maps), 164
bilinear, 359, 396
composition of, 165
linear, 167
matrix, 168
Matrices:
congruent, 360
equivalent, 87
similar, 203
Matrix, 27
augmented, 59
change-of-basis, 199
coefficient, 59
companion, 304
diagonal, 35
echelon, 65, 70
elementary, 84
equivalence, 87
Hermitian, 38, 49
identity, 33
invertible, 34
nonsingular, 34
normal, 38
orthogonal, 237
positive definite, 238
rank, 72, 87
space, M_{m,n}, 114
square root, 296
triangular, 36
Matrix mapping, 165 
Matrix multiplication, 30 


Matrix representation, 195, 238, 360 


adjoint operator, 377, 384 
bilinear form, 359 
change of basis, 199 
linear mapping, 195 
Metric space, 241 
Minimal polynomial, 303, 305 
Minkowski’s inequality, 5 
Minor, 269, 273 
principal, 273
Module, 407 
Monic polynomial, 303, 411
Moore-Penrose inverse, 418 
Multilinearity, 276, 399 
Multiplicity, 298 
Multiplier, 67, 73, 87 


N 
n-linear, 276 
n-space, 2 

complex, 13 

real, 2 
Natural mapping, 351 
New basis, 199 
Nilpotent, 328, 336 
Nonnegative semidefinite, 226 
Nonsingular, 112 
linear maps, 172 
matrices, 34 
Norm, 5, 227, 241 
Normal, 7 
matrix, 38 
operator, 380, 383 
Normalized, 227
Normalizing, 5, 227, 233
Normed vector space, 241
Nullity, 171
nullsp(A), 170
Null space, 170


O 

Old basis, 199 

One-norm, 241 

One-to-one: 
correspondence, 166 
mapping, 166 

Onto mapping (function), 166 


Operators (See Linear operators) 


Order, n: 
determinant, 264 
of a group, 408 

Orthogonal, 4, 37, 80 
basis, 231 
complement, 231 
matrix, 237 
operator, 380 
projection, 384 
substitution, 302 


Orthogonalization, Gram-Schmidt, 235
Orthogonally equivalent, 381 
Orthonormal, 233 

Outer product, 10 


P 
Parameter, 64 
form, 65 
Particular solution, 58 
Partition, 416 
Permutations, 8, 267 
Perpendicular, 4 
Pivot, 67, 71 
row reduction, 94 
variables, 65 
Pivoting (row reduction), 94 
Polar form, 363 
Polynomial, 411 
characteristic, 294, 305 
minimum, 303 
space, P_n(t), 114
Positive definite, 226 
matrices, 238 
operators, 336, 382 
Positive operators, 226 
square root, 391 
Primary decomposition theorem, 328
Prime ideal, 410 
Principal ideal ring, 406
Principal minor, 273
Product: 
exterior, 401 
inner, 4 
tensor, 396 
Projections, 167, 234, 344, 384 
orthogonal, 384 
Proper value, 296 
vector, 296 
Pythagorean theorem, 233 


Q 
Q, rational numbers, 11 
Quadratic form, 301, 315, 363 
Quotient 

group, 403 

ring, 406 

spaces, 332, 416 


R 
R, real numbers, 1, 12 
R^n, real n-space, 2
Range, 164, 169 
Rank, 72, 87, 126, 171, 364 
Rational: 
canonical form, 331 
numbers, Q, 11 
Real: 
numbers, R, 1 
part (complex number), 12 


Real symmetric bilinear form, 363 
Reduce, 73 
Relation, 415 
Representatives, 416 
Restriction mapping, 192 
Right-handed system, 11 
Right inverse, 189 
Ring, 405 
quotient, 406 
Root, 293 
Rotation, 169 
Row, 27 
canonical form, 72 
equivalence, 72 
operations, 72 
rank, 72 
reduce, 73 
reduced echelon form, 73 
space, 120 


S 
S_n, symmetric group, 267, 404
Scalar, 1, 12 
matrix, 33 
multiplication, 33 
product, 27 
Scaling factor, 296 
Schwarz inequality, 5, 229, 240 
(See Cauchy-Schwarz inequality)
Second dual space, 350 
Self-adjoint operator, 380 
Sign of permutation, 267 
Signature, 364 
Similar, 203, 224 
Similarity transformation, 203 
Singular, 172 
Size (matrix), 27 
Skew-adjoint operator, 380 
Skew-Hermitian, 38 
Skew-symmetric, 360 
matrix, 36, 48 
Solution (linear equations), 57
zero, 121 
Spatial vectors, 9 
Span, 116 
Spanning sets, 116 
Spectral theorem, 383 
Square: 
matrix, 32, 44 


system of linear equations, 58, 72 


Square root of a matrix, 391 
Standard: 
basis, 125 
form, 57, 399 
inner product, 228 
Subdiagonal, 304 
Subgroup, 403 
Subset, 112 
Subspace, 117, 133 
Sum of vector spaces, 129 


Summation symbol, 29 
Superdiagonal, 304 
Surjective map, 166 
Sylvester’s theorem, 364 
Symmetric: 

bilinear form, 361 

matrices, 4, 36 
Systems of linear equations, 58 


T 
Tangent vector, T(t), 9 
Target set, 164 
Tensor product, 396 
Time complexity, 88 
Top-down, 73 
Trace, 33 
Transformation (linear), 167 
Transition matrix, 199 
Transpose: 
linear functional (dual space), 351 
matrix, 32 
Triangle inequality, 230 
Triangular form, 64 
Triangular matrix, 36, 47 
block, 40 
Triple product, 11 
Two-norm, 241 


U 
Unique factorization domain, 406, 414 
Unit vector, 5, 227 

matrix, 33 
Unitary, 38, 49, 380 
Universal mapping principle (UMP), 396 
Usual: 

basis, 125 

inner product, 228 


V
Vandermonde determinant, 290 
Variable, free, 65 
Vector, 2 
coordinates, 130 
located, 7 
product, 10 
spatial, 9 
Vector space, 112, 226 
basis, 124 
dimension, 124 
Volume, 274 


W 
Wedge (exterior) product, 401 


Z 

Z, integers, 406 

Zero: 
mapping, 128, 168, 173 
matrix, 27 
polynomial, 411 
solution, 121 
vector, 2 


